blog: program your way beyond MAX_PATH

Discussion & Support for xplorer² professional

Moderators: fgagnon, nikos, Site Mods

nikos
Site Admin
Posts: 15791
Joined: 2002 Feb 07, 15:57
Location: UK

blog: program your way beyond MAX_PATH

Post by nikos »

this is the comments area for the (delayed) blog post found at:
http://zabkat.com/blog/max-path-program ... okbook.htm
snemarch
Bronze Member
Posts: 94
Joined: 2008 Jan 15, 10:08

Re: blog: program your way beyond MAX_PATH

Post by snemarch »

"This hints a deeper (no pun intended) problem in windows kernel,"
No, it hints at the Win32 subsystem having lots of MAX_PATH dependencies, not the kernel.
"On windows 10 you can have super deep paths even on USB sticks (FAT32 !?) which is surprising"
Did you verify the FS was FAT32? There's nothing stopping you from putting NTFS or exFAT on a USB stick. Also, FAT32 with LFNs has a max path length of 255 UCS-2 characters, if memory serves me right - never looked at exFAT.
"The old INI file API (GetPrivateProfileString et al) is mysteriously out of bounds. It doesn't make sense because all other low level file access works. Pain in the @$$"
GetPrivateProfileString and friends do file parsing, so they're by definition high-level, not low-level. I've never disassembled it, but since lpFileName can be relative to the Windows folder, it's safe to assume it has buffers and string copying. It's a crusty old API, and since all features start out at minus 100 points, it's easy to see why it's not a prime candidate for rewriting.
"CreateDirectory (without \\?\ prefix) fails even for paths a little below MAX_PATH"
It's not just "a little below", the rules are pretty well defined - TL;DR snippet: 'For example, the maximum path on drive D is "D:\some 256-character path string<NUL>"' - it's called MAX_PATH, after all, not MAX_DIRECTORY_NAME_LENGTH :-)
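To make the numbers concrete, here is a minimal sketch of the \\?\ workaround as plain string handling, so it compiles anywhere. The helper names (to_extended_path, needs_long_prefix_for_mkdir) are made up for illustration, and the 248-character directory threshold is an assumption based on the documented "MAX_PATH minus 12" rule (room for an 8.3 file name); check it against the current docs before relying on it.

```cpp
#include <cstddef>
#include <string>

// Values as documented for Win32, redefined here so the sketch is portable.
constexpr std::size_t kMaxPath = 260;               // MAX_PATH, counts the NUL
constexpr std::size_t kMaxDirPath = kMaxPath - 12;  // assumed CreateDirectory limit

static bool starts_with(const std::wstring& s, const std::wstring& prefix) {
    return s.compare(0, prefix.size(), prefix) == 0;
}

// Hypothetical helper: true when a directory path is too long to pass to
// CreateDirectoryW without the extended-length prefix.
bool needs_long_prefix_for_mkdir(const std::wstring& path) {
    return path.size() >= kMaxDirPath;
}

// Hypothetical helper: add the extended-length prefix to an absolute path.
// The caller must already have a fully qualified, backslash-only path,
// because the prefix turns off normalization ("..", "/", trailing dots).
// Device paths such as \\.\ are not handled in this sketch.
std::wstring to_extended_path(const std::wstring& path) {
    const std::wstring ext = L"\\\\?\\";
    if (starts_with(path, ext)) return path;      // already prefixed
    if (starts_with(path, L"\\\\"))               // UNC: \\server\share\...
        return ext + L"UNC\\" + path.substr(2);
    return ext + path;                            // drive path: C:\...
}
```

On Windows the result would be handed to the wide-character API (for example CreateDirectoryW); the ANSI variants do not accept the prefix.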
nikos
Site Admin
Posts: 15791
Joined: 2002 Feb 07, 15:57
Location: UK

Re: blog: program your way beyond MAX_PATH

Post by nikos »

too much of tl;dr; and little actual experimentation before stating opinions in the strongest terms, you must be the reincarnation of Aristotle ;)
Kilmatead
Platinum Member
Posts: 4578
Joined: 2008 Sep 30, 06:52
Location: Dublin

Re: blog: program your way beyond MAX_PATH

Post by Kilmatead »

In his defence (and speaking as one who's done quite a bit of actual experimentation :sad:), the Win32API is littered with older functions which don't support even the \\?\ workaround, forcing the bereft coder to spin their own versions of many handy things (such as most of the "Pathing" stuff like PathRemoveFileSpecW, ILCreateFromPathW, PathIsPrefixW, ad nauseam).

I mean, come on... your own example gives wvsprintf in lieu of _vsnwprintf (which gives better results from some format masks anyway). Don't get me wrong - I'm not a big fan of MS's attempt at making the CRT "safe" for idiot-coders, so most of their spaghetti-code "replacement" functions are just silly (and obvious precursors of convincing the unwary dev to think .NET is a better way), but they have washed their hands with the PrivateProfileString stuff... one gets the idea that the large disclaimer at the top which states "This function is provided only for compatibility with 16-bit Windows-based applications" has been there for... um... 20 years? :D It does, at least, do any ANSI/Multibyte conversions itself automatically, but that's small consolation...

Sometimes it's easier to just parse the bloody bastard yourself. Like, dude, in x2's INI you can have up to 700 lines of sections and keys... did you actually use Get/Write PPS's for each one of those manually? (You didn't, did you?) That would be... crazy! :shock: Never mind the redundant disc/access overhead alone (unless they cache it all on first access, but there's no way to tell if that's the case)... great for small stuff, but unwieldy for major-scale...
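A rough sketch of the "parse it yourself" approach: read the whole file once into a nested map instead of calling GetPrivateProfileString per key. The names (IniData, parse_ini) are invented for this example, and it is deliberately simplified: narrow strings only, case-sensitive section and key names (the Win32 API is not case-sensitive), no quoting or escapes.

```cpp
#include <istream>
#include <map>
#include <sstream>  // only needed for the in-memory usage example
#include <string>

// section name -> (key -> value); keys before any [section] land under "".
using IniData = std::map<std::string, std::map<std::string, std::string>>;

static std::string trim(const std::string& s) {
    const char* ws = " \t\r\n";
    const auto first = s.find_first_not_of(ws);
    if (first == std::string::npos) return "";
    const auto last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}

// Single pass over the stream: one read of the file, then lookups are
// plain map accesses with no further disk access.
IniData parse_ini(std::istream& in) {
    IniData data;
    std::string section;
    std::string line;
    while (std::getline(in, line)) {
        line = trim(line);
        if (line.empty() || line[0] == ';' || line[0] == '#') continue;  // blank or comment
        if (line.front() == '[' && line.back() == ']') {
            section = trim(line.substr(1, line.size() - 2));
            continue;
        }
        const auto eq = line.find('=');
        if (eq == std::string::npos) continue;  // not key=value, skip it
        data[section][trim(line.substr(0, eq))] = trim(line.substr(eq + 1));
    }
    return data;
}
```

Feeding it a std::ifstream (or a std::istringstream for testing) gives all sections and keys in one go, which sidesteps both the per-call file access and the path-length limits of the old API, since you open the file yourself.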

(On an unrelated note, I was shocked a few weeks ago when I discovered just how unhelpfully primitive the ICC_TAB_CLASSES stuff is... I mean, Holy bits and bytes Batman! I'm a big fan of doing stuff the hard way, but subclassing everything in sight yourself just because "they" obviously left the office early that day? Yikes!)
nikos
Site Admin
Posts: 15791
Joined: 2002 Feb 07, 15:57
Location: UK

Re: blog: program your way beyond MAX_PATH

Post by nikos »

that's why I say that saving settings in INI is "much slower" and to be avoided unless you have to (portable)
Kilmatead
Platinum Member
Posts: 4578
Joined: 2008 Sep 30, 06:52
Location: Dublin

Re: blog: program your way beyond MAX_PATH

Post by Kilmatead »

But... but... <affects flabbergasted pose of disbelief> didn't you go insane doing the copy/paste/edit dance a million times over? Do you have a cheap supply of benzodiazepines that we don't know about? :D

(Heck, in the x2PluginManager thingy I didn't use a single PPS at all! I considered it to be too... too... pedestrian/plebeian.)

Of course, one also wonders if penalising the registry user with two superfluous PathFileExistsW calls is polite either... :wink:
snemarch
Bronze Member
Posts: 94
Joined: 2008 Jan 15, 10:08

Re: blog: program your way beyond MAX_PATH

Post by snemarch »

nikos wrote: 2017 Jan 12, 07:02 too much of tl;dr; and little actual experimentation before stating opinions in the strongest terms, you must be the reincarnation of Aristotle ;)
Ah, an ad hominem instead of dealing with the contents? ;-)

I've done a fair amount of experimentation myself, and the docs haven't always been right - in this case the TL;DR is obviously the simple & short version, though, not the entire picture... as it often is with the abbreviated TL;DR versions.
Kilmatead wrote: 2017 Jan 12, 07:34 I'm not a big fan of MS's attempt at making the CRT "safe" for idiot-coders
The only "idiot-coders" are the ones who designed the multitude of unsafe libc functions in the first place - but I wouldn't go as far as calling them idiots. The world was a different place back then ("Smashing The Stack For Fun And Profit" is from 1996, more than 20 years after the C language was invented). The safe CRT stuff has nothing to do with "pandering to idiotic .NET programmers", but everything to do with providing functions that are possible to use correctly - without requiring ugly workarounds that end up being slower anyway.
Kilmatead
Platinum Member
Posts: 4578
Joined: 2008 Sep 30, 06:52
Location: Dublin

Re: blog: program your way beyond MAX_PATH

Post by Kilmatead »

snemarch wrote: 2017 Jan 12, 09:35 The safe CRT stuff has nothing to do with "pandering to idiotic .NET programmers", but everything to do with providing functions that are possible to use correctly - without requiring ugly workarounds that end up being slower anyway.
That's not what I meant - I was referring to the liberal splatterings of warnings which MS litter MSDN with regarding "Do not use", and "These functions are unsafe", which makes it sound as if the original CRT was a pile of poo and only MS can come in to save the populace from themselves. There is nothing inherently unsafe about any of those functions (ok, the original strtok was unsound in threaded environments, but that's long gone now). One of the spirits and blessings of C which more modern languages have lost is the freedom to hurt yourself (there is a fine line between mindful precision and suicide, if you know what I mean) if that's what you want to do. Sure, the likelihood of stumbling into unintended rabbit-holes and quagmires is increased proportionally, but once upon a time programmers learned from the danger - now they "learn" by people just telling them what they shouldn't do, and coating everything in a nice safe layer of slime which is the legacy of managed code. If you use the "original unsafe" functions properly in the first place, they cannot be called unsafe, merely a bit volatile, as the best of ladies always are. :wink:

We will (I feel) disagree on the spirit of content here - I couldn't care less about "productivity" and all the rot that goes along with the white-collar criminal approach to programming; while that aspect is all very entertaining for generating revenue and blah blah blah, it lacks the certain élan vital which is (in my opinion) the whole purpose behind this stuff anyway. But then again, I did get expelled from the first college I was in for computer hackery (a popular yet unknown thing in the late 80's), so my perspective is perhaps tainted with a charm all its own. :D (Which inspired me to switch to an even more economically useless major, but that's a different story. The only part of me that regrets it is my knees, which get worse and worse every year, I fear. So it goes.)
snemarch
Bronze Member
Posts: 94
Joined: 2008 Jan 15, 10:08

Re: blog: program your way beyond MAX_PATH

Post by snemarch »

Kilmatead, there is a pretty large number of functions in libc that are only safe to use in special situations and cannot be used safely in the general way (which is how people often do use them) - so the warnings in MSDN are entirely appropriate. A number of these functions didn't have safe alternatives, so it's entirely appropriate to have introduced the safe alternatives rather than having every programmer A) reinventing the wheel or B) writing unsafe code.

And there certainly are parts of libc that are poo, but that doesn't mean the alternative has to be managed languages - you can do safe zero-overhead code in C++. (And you're indeed right that I disagree with you, I don't find managed code to be slime - different tools are good for different jobs).

I grew up on C/C++ and assembly and have done software security from both sides of the fence, and these days I routinely fix issues in other people's code. Problems usually caused by people who think they're clever, but end up hurting security, performance and readability as a consequence.

Anyway, back on track: if you're dealing with paths and filenames, do yourself a favor: stop working on character buffers, move up an abstraction level and find/design yourself a Path class. It centralizes where you have to deal with these issues, reduces the risk of bugs, and makes your code clearer.

It's a big task to redesign an existing system, though, and to achieve zero-overhead you will need to engage your brain slightly with regards to how you interface with external APIs (especially that of Win32 - its age shows). But engaging one's brain is what we programmers are supposed to do :-)
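As a very rough illustration of the Path class idea (not a finished design), something along these lines keeps the MAX_PATH handling in one place. The class and its member names are hypothetical, it only understands backslash-separated drive paths (no UNC, no root special-casing), and the 248 threshold is an assumption taken from the directory-creation limit; C++17's std::filesystem::path is an off-the-shelf alternative worth checking first.

```cpp
#include <string>
#include <utility>

// Hypothetical minimal Path wrapper: callers never touch raw buffers,
// and the long-path prefix is applied in exactly one function.
class Path {
public:
    explicit Path(std::wstring text) : text_(std::move(text)) {}

    // Join a child name, inserting a separator only when needed.
    Path operator/(const std::wstring& name) const {
        if (text_.empty() || text_.back() == L'\\') return Path(text_ + name);
        return Path(text_ + L'\\' + name);
    }

    // Everything before the last separator (empty if there is none).
    Path parent() const {
        const auto pos = text_.find_last_of(L'\\');
        return Path(pos == std::wstring::npos ? std::wstring() : text_.substr(0, pos));
    }

    // Everything after the last separator.
    std::wstring filename() const {
        const auto pos = text_.find_last_of(L'\\');
        return pos == std::wstring::npos ? text_ : text_.substr(pos + 1);
    }

    // Form to hand to wide-character Win32 calls: short paths pass through,
    // long ones get the extended-length prefix (assumed threshold: 248).
    std::wstring native() const {
        if (text_.size() < 248 || text_.compare(0, 4, L"\\\\?\\") == 0) return text_;
        return L"\\\\?\\" + text_;
    }

    const std::wstring& str() const { return text_; }

private:
    std::wstring text_;
};
```

The point of the design is that only native() knows about the prefix, so a fix to the long-path rules is a one-line change instead of a hunt through every call site.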
Kilmatead
Platinum Member
Posts: 4578
Joined: 2008 Sep 30, 06:52
Location: Dublin

Re: blog: program your way beyond MAX_PATH

Post by Kilmatead »

We'll just have to agree to disagree on most of that :D, as (if you ask MS) they'd suggest that anything based on character buffers is inherently unsafe on a purely philosophical level, which is just demonstrably untrue and based more on (as you suggest) the code people do produce, rather than the code they could produce. Sure, it's awkward as heck to wrap one's brain around using the basic functions in a properly tight fashion (the poor programmer has to be hyper aware and vigilant of everything), which in this day and age is not the most popular of philosophies to be espousing. But that does not a function unsafe make. :wink:

For example, Great Britain made a great show of itself yesterday by declaring itself the first country to recognise Parkour as a sport. (I'm old enough to think first of parquet flooring rather than some amusing urban runaround, but that's just me, and I digress.) We also shan't quibble on the definition of what a sport is either, as I'm the world's biggest Snooker fan but I'm also the first to deny it's any class of "sport" any more than Darts or Poker are. The primary result of GB's declaration however is that it can now be legally taught/practised in schools, as part of normal exercise curriculum.

Now it's pretty obvious to even the most casual observer that legally encouraging kids to imitate the dumbest stunts you can imagine on YouTube is a blatantly unsafe concept. In fact, it's probably the very definition of "unsafe" in the traditional sense of the term. That said, unless you're shackled by a mother's impractical sense of fear, there's technically nothing wrong with allowing people (kids, especially) to intentionally hurt themselves because (eventually) they do learn a proper respect for danger... and that can only be a good thing. That a few are lost to cracked skulls, broken bones, and death by misadventure is just natural selection at work, as crass as that may be.

So, it's a semantic thing: In the programming arena unsafe code is only unsafe for the other people (users) who will unintentionally suffer for the coder's mistakes. In that sense, I can excuse even managed code, and/or the designing of so-called safe functions as being laudable, though by no means necessary. What I take issue with is MS's nannying (and blind) insistence that potentially-unsafe and unsafe-actuelle mean the same thing. They do not.
snemarch wrote:move up an abstraction level
Curiously, having spent the majority of my life pursuing abstract philosophical ends (ne perdez pas votre vie à la gagner does not mean what Google suggests it means), I lean these days more in the opposite direction... I rail against adding more abstraction, and see less (in the Buckminster Fulleresque sense) as being actually more. But I'm weird that way; easier + safer != better. :D
nikos
Site Admin
Posts: 15791
Joined: 2002 Feb 07, 15:57
Location: UK

Re: blog: program your way beyond MAX_PATH

Post by nikos »

snemarch wrote: 2017 Jan 12, 18:48 if you're dealing with paths and filenames, do yourself a favor: stop working on character buffers, move up an abstraction level and find/design yourself a Path class. It centralizes where you have to deal with these issues, reduces the risk of bugs, and makes your code clearer
you are perfectly right on this one, but hindsight is a poor ally -- if only I had thought about this 15 years ago :)
windows shell adds another layer, that of a "pidl" which is a generic path to virtual folders, but I ended up mangling that too because there are no valid pidls for deep folders.
snemarch
Bronze Member
Posts: 94
Joined: 2008 Jan 15, 10:08

Re: blog: program your way beyond MAX_PATH

Post by snemarch »

nikos wrote: 2017 Jan 13, 06:43
snemarch wrote: 2017 Jan 12, 18:48 if you're dealing with paths and filenames, do yourself a favor: stop working on character buffers, move up an abstraction level and find/design yourself a Path class. It centralizes where you have to deal with these issues, reduces the risk of bugs, and makes your code clearer
you are perfectly right on this one, but hindsight is a poor ally -- only if I had thought about this 15 years ago :)
Indeed - my code from 10+ years ago also has character buffers for path stuff, and there's probably also code that assumes UCS-2 rather than UTF-16 for unicode. You live, you learn :)

Also: seems like a reply I (thought I?) made got lost :(
Kilmatead
Platinum Member
Posts: 4578
Joined: 2008 Sep 30, 06:52
Location: Dublin

Re: blog: program your way beyond MAX_PATH

Post by Kilmatead »

snemarch wrote: 2017 Jan 15, 21:39 and there's probably also code that assumes UCS-2 rather than UTF-16 for unicode.
Take 1000 developers (or monkeys with typewriters, in another age) at any given moment in space and time, ask for a show of hands from those who actually know that text stored in the 16-bit wchar_t type is a variable-width encoding (i.e., who recognise and correctly handle the characters in UTF-16's mystical upper planes, which take two 16-bit units, 32 bits in all), and you'll probably find 2 with their hands in the air, and one of them was just signalling the waiter for another drink.

In other words (at least as far as I can glean) it's de rigueur to equate UCS-2 (fixed-16-bit) with Windows' wonky understanding of UTF-16 ("more-or-less" fixed-16-bit) anyway.

Anyone who ever wrote (and those who still do...) malloc( textlength * sizeof(wchar_t) ) is equally guilty of your sin (in reverse). :D
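For anyone who wants to see the UCS-2 versus UTF-16 difference in code, here is a small portable sketch using char16_t (on Windows wchar_t is the same 16-bit unit). The function names are made up for illustration. It counts code points by treating a high surrogate followed by a low surrogate as one character; the number of 16-bit units, which is what buffer sizing needs, is still just size().

```cpp
#include <cstddef>
#include <string>

// Surrogate ranges as defined by the Unicode standard.
inline bool is_high_surrogate(char16_t c) { return c >= 0xD800 && c <= 0xDBFF; }
inline bool is_low_surrogate(char16_t c) { return c >= 0xDC00 && c <= 0xDFFF; }

// Hypothetical helper: number of Unicode code points in a UTF-16 string.
// A well-formed surrogate pair counts once; a lone surrogate is counted
// as one (broken) character rather than rejected.
std::size_t count_code_points(const std::u16string& s) {
    std::size_t n = 0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        if (is_high_surrogate(s[i]) && i + 1 < s.size() && is_low_surrogate(s[i + 1])) {
            ++i;  // skip the second half of the pair
        }
        ++n;
    }
    return n;
}
```

With a string like u"a\U0001F600b", size() reports 4 units while count_code_points reports 3 characters. Code that assumes UCS-2 treats those as the same number, which is harmless for sizing a buffer in units but goes wrong as soon as it truncates or indexes by "character".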

(I willingly admit as being guilty by proxy - "yes sir, officer, sir, I was in the room all right, but it was the other lads there that done throwed their deprecated TCHAR punches first - he was already down when I kicked him.") :wink:
nikos
Site Admin
Posts: 15791
Joined: 2002 Feb 07, 15:57
Location: UK

Re: blog: program your way beyond MAX_PATH

Post by nikos »

I wonder why the (exceeding) expertise on multibyte encodings K~1? do you speak mandarin or there's no pub nearby the castle?
Kilmatead
Platinum Member
Posts: 4578
Joined: 2008 Sep 30, 06:52
Location: Dublin

Re: blog: program your way beyond MAX_PATH

Post by Kilmatead »

As anyone who's been lurking on this forum would know, I enjoy a certain amount of esoterica, sometimes the more arcane and obtuse the better! :D Blame that guy in Detroit and his request which led to more head scratching and research than would first appear on the surface for something so small. :shrug:

You learned this Coo-coo-ca-choo by autodidact, why can't I? :wink: All human endeavour comes to naught, so I might as well get a kick out of my obscurity! :D