When Directory.GetFiles gets crappy and grabs *.htm but not *.html (here's one reason why)
The Old New Thing - Why does the Directory.GetFiles method sometimes ignore *.html files when I ask for *.htm?
...
A customer reported that one of their programs stopped working, and they traced the problem to the fact that a search for *.htm on some machines was no longer returning files like awesome.html, contrary to the documentation. What's going on?
What's going on is that the documentation is trying too hard to explain an observed behavior. (My guess is that some other customer reported the behavior, and the documentation team incorporated the customer's observations into the documentation without really thinking it through.)
The real issue is that the GetFiles method matches against both short file names and long file names. If a long file name has an extension that is longer than three characters, the extension is truncated to form the short file name. And it is that short file name that gets matched by *.htm or *.txt.
Even as originally written, in the presence of short file names, the documentation is wrong, because it would imply that a search for reallylong*.txt could match reallylong_filename.txtother. But try it: It doesn't. That's because the short name is probably REALLY~1.TXT, and that doesn't match reallylong*.txt.
What happened is that short file name generation was disabled on the drive at the time the files were created, so there was no short file name available, so there was consequently no SHORTN~1.HTM file to match against.
...
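To make the rule above concrete, here is a rough Python model of "match the pattern against the long name or the short name". It is only a sketch: short_name and matches are hypothetical helpers written for illustration, the 8.3 alias generation is heavily simplified (no collision handling, no illegal-character stripping), and fnmatch stands in for the Windows wildcard matcher, which has quirks of its own. It is not how GetFiles is implemented.

```python
from fnmatch import fnmatchcase
from typing import Optional


def short_name(long_name: str) -> str:
    """Simplified 8.3 alias (assumption: no collisions, no illegal characters)."""
    base, dot, ext = long_name.rpartition(".")
    if not dot:  # no extension at all
        base, ext = long_name, ""
    if len(base) <= 8 and len(ext) <= 3:
        return long_name.upper()  # already fits 8.3
    alias = base[:6].upper() + "~1"
    return alias + ("." + ext[:3].upper() if ext else "")


def matches(pattern: str, long_name: str, alias: Optional[str]) -> bool:
    """True if the pattern matches the long name or, when one exists, the short name."""
    pat = pattern.upper()
    if fnmatchcase(long_name.upper(), pat):
        return True
    # alias is None when short name generation was disabled for the file
    return alias is not None and fnmatchcase(alias.upper(), pat)


if __name__ == "__main__":
    name = "awesome.html"
    print(short_name(name))                         # simplified alias
    print(matches("*.htm", name, short_name(name)))  # short name present
    print(matches("*.htm", name, None))              # short name absent
    other = "reallylong_filename.txtother"
    print(matches("reallylong*.txt", other, short_name(other)))
```

In this model, awesome.html is found by *.htm only through its alias. Pass None for the alias, as when short name generation is disabled, and the match disappears. The reallylong*.txt case fails either way, because neither the long name nor the truncated alias fits the pattern.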
This is one of those things that you might never find and might never even know you might not find. No exception, it just doesn't work as expected...
Trust, but verify.