OT - Extension to read end-of-line encoding

Posted: 2012 May 21, 15:59
by Brig
Howdy:

Does anyone know if there's an extension out there that can determine if a text file is DOS, Unix, or Mac? I'm finding that such a thing would be really helpful to me. Ideally, the information would appear in the Infobar.

Thanks,
Brig
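
For reference, the DOS/Unix/Mac distinction comes down to which line terminator the file uses: CR+LF for DOS/Windows, a lone LF for Unix, and a lone CR for classic (pre-OS X) Mac. A minimal Python sketch of how that could be sniffed follows. The function name `sniff_line_endings` is made up for illustration, it is not part of x2 or any extension, and it assumes a single-byte or UTF-8 file (UTF-16 text would need decoding first).

```python
def sniff_line_endings(data: bytes) -> str:
    """Classify the line terminators found in a chunk of raw file bytes.

    Hypothetical helper; assumes a single-byte or UTF-8 encoding.
    """
    crlf = data.count(b"\r\n")
    # Every CRLF pair also contains one CR and one LF, so subtract the pairs
    # to get the lone carriage returns and lone line feeds.
    cr = data.count(b"\r") - crlf
    lf = data.count(b"\n") - crlf

    counts = {"DOS (CRLF)": crlf, "Unix (LF)": lf, "Mac (CR)": cr}
    present = [name for name, n in counts.items() if n > 0]

    if not present:
        return "none"    # no line breaks in this chunk at all
    if len(present) > 1:
        return "mixed"   # more than one convention in the same chunk
    return present[0]
```

Reading only the first few KB in binary mode, e.g. `sniff_line_endings(open(path, "rb").read(4096))`, would usually be enough, though a chunk boundary can split a CR+LF pair and files with mixed endings further down would be missed.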

Posted: 2012 May 21, 16:55
by nikos
the closest i can think of is editor2: if you hit F3 on a text file you can see its encoding in the editor2 statusbar

Posted: 2012 May 21, 17:08
by Brig
Yeah, I know about that, thanks. I just thought it would be nice to see it in x2, without having to open the files.

Posted: 2012 May 21, 17:17
by Kilmatead
Brig wrote: I just thought it would be nice to see it in x2, without having to open the files.
Therein lies the irony. :wink:

At first glance, you'd think this would be a common enough thing that there'd be plenty of ways to determine an encoding type. As it turns out, since text files aren't required to use BOMs or any standard headers to declare their text data-type, determining something like this accurately is not as simple as you'd hope. Basically, without a BOM (Byte Order Mark) the entire file would need to be read and broken down into the bit patterns of identifiable international character sets (ISO, OEM, etc), and those patterns don't appear to be set down in any universal "10 Commandments written in Stone" rulebook - so in some scenarios, it's not even possible to do it 100% reliably.

That said, it would be easy to write a script to immediately identify ANSI, UTF8, UTF16 & "endianness" - but at the end of the day that doesn't really tell you much, without analysing the actual contextual contents of the file itself (for example, are you a Russian spy using Windows to create backwards cyphers to decode Cyrillic characters into a recipe that your turncoat mate in the Indian subcontinent could easily convert (via a Linux server) into a particularly romantic type of sushi that even a Hindu could enjoy while using her 1980s Macintosh in an attempt to further undermine the already unstable government of the land?).
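
A script of that kind only needs the first few bytes of the file. Here is a rough Python sketch, not anything x2 ships with: the names `sniff_bom` and `sniff_file` are invented for illustration, the BOM constants come from the standard `codecs` module, and UTF-32 is included only because its little-endian mark starts with the same two bytes as the UTF-16 one.

```python
import codecs

# Check the 4-byte UTF-32 marks before the 2-byte UTF-16 ones, because
# BOM_UTF32_LE (FF FE 00 00) begins with BOM_UTF16_LE (FF FE).
BOMS = [
    (codecs.BOM_UTF32_LE, "UTF-32 LE"),
    (codecs.BOM_UTF32_BE, "UTF-32 BE"),
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_LE, "UTF-16 LE"),
    (codecs.BOM_UTF16_BE, "UTF-16 BE"),
]


def sniff_bom(head: bytes) -> str:
    """Name the encoding announced by a BOM at the start of `head`."""
    for bom, name in BOMS:
        if head.startswith(bom):
            return name
    # Without a BOM the header says nothing: the file could be ANSI,
    # BOM-less UTF-8, or anything else.
    return "no BOM"


def sniff_file(path: str) -> str:
    """Read only the first four bytes of the file and check them."""
    with open(path, "rb") as f:
        return sniff_bom(f.read(4))
```

A "no BOM" answer is exactly the case where the header gives no help and the contents would have to be analysed. Also, a UTF-16 LE file whose first character is NUL would be reported as UTF-32 LE, which is a known ambiguity of BOM sniffing.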

Ironically, that's why most people just suggest the logical path of "open it in Notepad++ and see what it tells you" :shrug:.

So, to put it the long way, I don't think such an extension exists, due to the requirement of reading entire files (not just simple headers) to determine the type. But, like I said, if you only want to know ANSI, UTF8, etc, that can be done in a script without much thought - maybe displayed using right-click interrogation methods that the Geneva Convention outlawed years ago. But hey, that's an everyday thing for a Russian spy such as yourself, no?

Posted: 2012 May 21, 17:27
by Brig
Thanks K. Nice explanation.

But what about the preview pane? Doesn't it have to do some reading?

Posted: 2012 May 21, 17:37
by Kilmatead
Yes, it reads the file too - but the rendering is not done by x2 itself, except for the plain Draft one, which, by definition, doesn't do anything fancy if the necessary info isn't provided up-front. It doesn't do anything heroic.

Even then Nikos would have to add a status bar or something similar to the pane to display the information, and you know how difficult it already is to get him to cut the grass and take out the rubbish on Sunday evenings.  People just don't have the energy anymore once they do their bit for overpopulating the planet with their screaming progeny.  However, it's not a bad idea. :thumbup:

Posted: 2012 May 21, 17:50
by Brig
Interesting. I'm happy leaving it to Nikos's discretion and energy level. I fully appreciate the life-force drain that one's progeny represent. I can tell you it gets better. Until, that is, your firstborn becomes a teenager. My son is thirteen, and my wife and I have thus returned to the salt mines of parenting.