Key: IDEA-18447
Type: Bug
Status: Open
Assignee: Alexey Kudravtsev
Reporter: Amnon I. Govrin
Votes: 0
Watchers: 0
IDEA: Feedback

Saving unicode html files breaks the encoding

Created: 17 Jun 08 17:02   Updated: 17 Jun 08 23:05
Component/s: HTML.Editing

File Attachments: 1. HTML File bad.html (4 kb)
2. HTML File good.html (4 kb)

Environment: Windows Vista Ultimate SP1 English

Build: 7860
Severity: High


Description
Attached are 2 files - good.html and bad.html.
bad.html was created by opening good.html, changing something, then changing it back (to make the file saveable but identical) and saving it.
The result in this very simple case is that 2 bytes (FE FF) are inserted near the start of the file, right after its 2nd byte, and the bytes of every 2-byte pair are swapped (e.g. "00 C0" instead of "C0 00").
The file seems fine in IntelliJ until it is closed and reopened. The file is effectively unusable unless the changes are manually reversed in a binary editor (I used Visual Studio 2005).
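The byte-level change described above can be reproduced and reversed with a short script. This is only a sketch under assumptions: the report does not say what the first 2 bytes of good.html are, so the code assumes they are the UTF-16 little-endian byte order mark (FF FE); the sample text and the helper names `swap_pairs` and `repair` are hypothetical and are not taken from the attached files.

```python
import codecs

# Hypothetical sample text; the attached files are not reproduced here.
text = "\u00c0<html>"

# Assumption: the good file is UTF-16 LE and starts with the LE BOM (FF FE).
good = codecs.BOM_UTF16_LE + text.encode("utf-16-le")

# Simulate the reported damage: keep the first two bytes, insert FE FF,
# then write the rest with the bytes of each pair swapped (UTF-16 BE order).
bad = good[:2] + codecs.BOM_UTF16_BE + text.encode("utf-16-be")


def swap_pairs(data: bytes) -> bytes:
    """Swap the two bytes of every 16-bit unit (expects an even length)."""
    out = bytearray(data)
    out[0::2], out[1::2] = data[1::2], data[0::2]
    return bytes(out)


def repair(data: bytes) -> bytes:
    """Undo the simulated damage: drop the inserted FE FF, swap pairs back."""
    if data[2:4] != codecs.BOM_UTF16_BE:
        return data  # Pattern not present; leave the data untouched.
    return data[:2] + swap_pairs(data[4:])


print(good.hex(" ").upper())
print(bad.hex(" ").upper())
print(repair(bad) == good)
```

The same reversal is what was done by hand in the binary editor; the script only covers this simple case, not files with several inserted "FE FF" pairs.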
In a less controlled environment, before I understood what was happening, I had a file with three problems: multiple occurrences of "FE FF" added to the file, which I had to remove manually; the "endian reversal" problem described above; and end-of-line problems, where every "0D 00 0A 00" became (after the fixes above) "0D 00 0D 0A 00". The end-of-line problem again throws IE and Notepad (among others), and IntelliJ after the file is closed and reopened, into the wrong encoding mode (Japanese/Chinese characters are shown).
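The end-of-line pattern is exactly what a byte-level LF to CRLF translation produces when it is applied to UTF-16 little-endian data. Whether IntelliJ actually does this is an assumption, not something confirmed here; the sketch below only shows that the reported bytes match, using a hypothetical two-line sample.

```python
# Hypothetical sample: two short lines encoded as UTF-16 LE with CRLF endings.
original = "a\r\nb\r\n".encode("utf-16-le")

# A byte-level LF -> CRLF translation that is unaware each character
# occupies two bytes in this encoding.
damaged = original.replace(b"\n", b"\r\n")

print(original.hex(" ").upper())
# 61 00 0D 00 0A 00 62 00 0D 00 0A 00
print(damaged.hex(" ").upper())
# 61 00 0D 00 0D 0A 00 62 00 0D 00 0D 0A 00
```

Each line ending gains one extra byte, so every 2-byte unit after it is shifted by one byte. That misalignment would explain why tools then show Japanese/Chinese characters.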
This seems to happen only with HTML files.
I tried the released 7.0.3 (7757) and EAP 7860. I'm having too many issues with Diana at this point to invest too much in using it.
This renders working on the gadget in IntelliJ impossible. Currently working on the gadget is priority 1 of my job.
I am working on a Windows sidebar gadget and the file is inside the "C:\Program Files\Windows Sidebar\Gadgets\MyGadget.Gadget" directory on Vista.
There are probably more conditions involved than just editing any HTML file, but the problem occurs only when the file is saved after a change in IntelliJ, not when it is saved from other tools.
Default IntelliJ encoding is set to UTF-8.

Comments
Amnon I. Govrin - 17 Jun 08 23:01
After further investigation, it seems that changing the encoding in the HTML meta data section to UTF-8 fixes the problem.
I still can't understand this, as editing the file anywhere else (I tried Notepad and Visual Studio 2005) doesn't cause problems.
Is IntelliJ too smart for its own good about file encoding? Does it look at that data and get confused?
Anyway, this is definitely not of extreme severity now, as we can work again.
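One way to check whether a file is exposed to this kind of confusion is to compare the encoding implied by its byte order mark with the charset declared in its meta tag. The sketch below is a diagnostic only, not a description of what IntelliJ does; the function names are hypothetical, and the "iso-8859-1" declaration in the sample is an assumption, since the report does not say what the meta tag originally contained.

```python
import codecs
import re

# BOMs checked here; UTF-32 is deliberately ignored in this sketch.
BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def bom_encoding(data: bytes):
    """Return the encoding implied by a leading BOM, or None."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    return None


def declared_charset(data: bytes):
    """Return the charset named in the markup (lowercased), or None."""
    text = data.decode(bom_encoding(data) or "latin-1", errors="replace")
    match = re.search(r'''charset\s*=\s*["']?([\w-]+)''', text, re.IGNORECASE)
    return match.group(1).lower() if match else None


def family(name: str) -> str:
    """Normalise names so utf-16, utf-16-le and UTF_16BE compare equal."""
    key = name.lower().replace("-", "").replace("_", "")
    return "utf16" if key.startswith("utf16") else key


def encodings_disagree(data: bytes) -> bool:
    """True when both a BOM and a declared charset exist and conflict."""
    bom, declared = bom_encoding(data), declared_charset(data)
    if bom is None or declared is None:
        return False
    return family(bom) != family(declared)


# Hypothetical sample: UTF-16 LE bytes that declare a different charset.
markup = '<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">'
sample = codecs.BOM_UTF16_LE + markup.encode("utf-16-le")
print(bom_encoding(sample), declared_charset(sample), encodings_disagree(sample))
```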

Amnon I. Govrin - 17 Jun 08 23:05
Changed severity to High, as we found a workaround (described in the comment above).