Wednesday, December 01, 2004

Unicode Path Fun...

In my industry/field (EDD) we run into many "fun" issues. This week's issue is long paths. Files get copied quite often during EDD processing. No one wants to touch original files, so drive imaging is used all the time. So that's a copy...

Then in many cases the files get staged somewhere. Not wanting to touch where a file is located on the image, the original path is appended to the staged location.

Then the files get copied again and again and again during the different phases. Since each phase (handled by different companies, people, processes, OSes, applications, etc.) wants to maintain the "original" paths, these paths are appended to any new path.

For example:

C:\test.txt <- Original

C:\test.txt <- Drive Image

\\Server\Share\Client\Custodian\Machine\C_Drive\test.txt <- Process 1

\\Server2\Share\ClientX\Client\Custodian\Machine\C_Drive\test.txt <- Process 2

\\Server3\Share\Stuff to do\Outsourceclient\Client X\Client\Custodian\Machine\C_Drive\test.txt <- Process 3

\\AnotherServer\Share\WorkForClientY\OutputFromUtility\Stuff to do\Outsourceclient\Client X\Client\Custodian\Machine\C_Drive\test.txt <- Process 4

and so on, and so on...

Which means that at the end there's a good chance that some of the data will be in paths longer than MAX_PATH (260 characters, counting the terminating null). Actually, it's more than a good chance: it does happen. I've seen some scary path lengths.
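To make the growth concrete, here is a minimal Python sketch (not part of the actual process stack) of how a preserved path lengthens as each phase nests it under its own folder. The names `nest` and `too_long` and the phase folders are hypothetical, loosely modelled on the example above.

```python
MAX_PATH = 260  # Win32 limit in characters, counting the terminating null


def nest(folder: str, relative_path: str) -> str:
    """Prepend one phase's folder to the path preserved from earlier phases."""
    return folder.strip("\\") + "\\" + relative_path.lstrip("\\")


def too_long(path: str) -> bool:
    """True if the path cannot fit in a MAX_PATH buffer along with its null."""
    return len(path) >= MAX_PATH


if __name__ == "__main__":
    # Hypothetical phase folders; real ones are usually much longer.
    phase_folders = [
        r"Client\Custodian\Machine",
        r"ClientX",
        r"Stuff to do\Outsourceclient",
        r"WorkForClientY\OutputFromUtility",
    ]
    preserved = r"C_Drive\test.txt"
    for number, folder in enumerate(phase_folders, start=1):
        preserved = nest(folder, preserved)
        full = "\\\\Server\\Share\\" + preserved
        print(number, len(full), too_long(full))
```

Every phase adds its own folders in front of everything earlier phases preserved, so the length only ever grows. A file that started deep in a user profile gets to the limit much faster than this toy case does.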

This means each system, solution, process, etc must determine a way to handle these. Right now the solution I am involved with handles these manually (i.e. with human intervention). We've been working on our process stack to automate long path handling and now we're at the last item. Which is where my fun begins today.

Today I am playing with the Unicode FindFirstFileW, FindNextFileW, and CopyFileExW APIs, building proof-of-concept apps to exercise the APIs, sample cases, etc. Luckily we already use the ANSI versions of these APIs, so the conversion to the Unicode versions shouldn't be too bad. I have to make sure my null trimming handles wide characters, etc., etc.
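The wide-character null trimming is easy to get wrong, so here is a rough Python sketch of the idea (an illustration only, not the code from our apps). It assumes the API hands back a fixed-size buffer of UTF-16LE bytes; `trim_wide_nulls` is a hypothetical helper name.

```python
def trim_wide_nulls(buffer: bytes) -> str:
    """Return the text before the first UTF-16 null in a fixed-size buffer."""
    # Step two bytes at a time: a wide null is the 16-bit unit 0x0000.
    # A single zero byte is just half of an ordinary character
    # (for example, "A" is 0x41 0x00 in UTF-16LE), so a byte-wise
    # search for a null would cut the string short or split a character.
    end = len(buffer) - len(buffer) % 2  # ignore a dangling odd byte
    for offset in range(0, end, 2):
        if buffer[offset] == 0 and buffer[offset + 1] == 0:
            end = offset
            break
    # errors="replace" keeps unpaired surrogates from raising an exception.
    return buffer[:end].decode("utf-16-le", errors="replace")


if __name__ == "__main__":
    raw = "test.txt".encode("utf-16-le") + b"\x00\x00" + b"\xff" * 8
    print(trim_wide_nulls(raw))
```

The trap is the ANSI habit of searching for the first zero byte: in a wide buffer nearly every other byte of plain ASCII text is zero.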

Building the sample case was kind of fun. I needed a deep path to play with...

Using the latest version of robocopy (XP010) I created a deep path (60 levels deep for now). Of course Windows Exploder (err, I mean Explorer) can't navigate past MAX_PATH, nor can any of the standard command line utilities. So on my Windows box, I have to use the GNU/Unix/SFU utilities to traverse/move/delete these long paths... I find that ironic.

Well back to coding... :)

2 comments:

Anonymous said...

Greg,

Did you ever find a solution for how to deal with deep paths? I am building an app that has to copy files from one location to the other and deal with deep paths.

Chris

Greg said...

You've reminded me that I meant to write this up in a more complete post...

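Yep, the "wide" APIs did the trick for me.

There are some tricks to using them in VB6, though. The paths need to have the correct Unicode path prefix: "\\?\" for local HDs or "\\?\UNC\" for network shares (so \\Server\Share becomes \\?\UNC\Server\Share).

Also, the W APIs need pointers to the strings passed to them (where the A versions don't). VB6 converts a ByVal String argument to ANSI when it calls a Declare, so the wide versions have to be handed the address of the string instead.

Here are the VB6 declares I used (note how the strings are typed as Longs).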
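The prefix rule is the same in any language, so here it is as a small Python sketch rather than VB6. `to_long_path` is a hypothetical helper name, and it assumes the input is already an absolute path with backslashes.

```python
def to_long_path(path: str) -> str:
    """Add the \\\\?\\ or \\\\?\\UNC\\ prefix that the wide APIs expect."""
    # Prefixed paths are not normalised by Windows, so this assumes an
    # absolute path that uses backslashes and has no "." or ".." parts.
    if path.startswith("\\\\?\\"):
        return path  # already prefixed
    if path.startswith("\\\\"):
        # \\Server\Share\... becomes \\?\UNC\Server\Share\...
        return "\\\\?\\UNC\\" + path[2:]
    # C:\... becomes \\?\C:\...
    return "\\\\?\\" + path


if __name__ == "__main__":
    print(to_long_path("C:\\test.txt"))
    print(to_long_path("\\\\Server\\Share\\Client\\test.txt"))
```

Note that for a share the two leading backslashes are dropped before the prefix goes on. Simply gluing "\\?\" onto the front of a UNC path does not work.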

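' Not shown in my original snippet: the standard Win32 constant and
' FILETIME type that the declares below depend on.
Private Const MAX_PATH As Long = 260

Private Type FILETIME
    dwLowDateTime As Long
    dwHighDateTime As Long
End Type

Private Type WIN32_FIND_DATA
    dwFileAttributes As Long
    ftCreationTime As FILETIME
    ftLastAccessTime As FILETIME
    ftLastWriteTime As FILETIME
    nFileSizeHigh As Long
    nFileSizeLow As Long
    dwReserved0 As Long
    dwReserved1 As Long
    cFileName(MAX_PATH * 2 - 1) As Byte  ' 260 wide chars = 520 bytes
    cAlternate(14 * 2 - 1) As Byte       ' 14 wide chars = 28 bytes
End Type

' lpFileName is a Long so that StrPtr() can pass the address of the string.
Private Declare Function FindFirstFile Lib "kernel32" _
    Alias "FindFirstFileW" _
    (ByVal lpFileName As Long, _
    lpFindFileData As WIN32_FIND_DATA) As Long

Private Declare Function FindNextFile Lib "kernel32" _
    Alias "FindNextFileW" _
    (ByVal hFindFile As Long, _
    lpFindFileData As WIN32_FIND_DATA) As Long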

So to call it, you have to use the StrPtr function.

hFile = FindFirstFile(StrPtr(sFindFileStartPath), WFD)
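If you want to sanity-check the byte-array sizes in the Type, one way is to lay the same structure out in Python's ctypes and look at the sizes and offsets. This is just a cross-check sketch, not something from my VB6 project. It uses 16-bit integers for the wide characters so the numbers come out the same on any OS.

```python
import ctypes

MAX_PATH = 260


class FILETIME(ctypes.Structure):
    _fields_ = [
        ("dwLowDateTime", ctypes.c_uint32),
        ("dwHighDateTime", ctypes.c_uint32),
    ]


class WIN32_FIND_DATAW(ctypes.Structure):
    # Same field order as the VB6 Type; each wide character is one
    # 16-bit unit, which is why the VB6 byte arrays are twice as long.
    _fields_ = [
        ("dwFileAttributes", ctypes.c_uint32),
        ("ftCreationTime", FILETIME),
        ("ftLastAccessTime", FILETIME),
        ("ftLastWriteTime", FILETIME),
        ("nFileSizeHigh", ctypes.c_uint32),
        ("nFileSizeLow", ctypes.c_uint32),
        ("dwReserved0", ctypes.c_uint32),
        ("dwReserved1", ctypes.c_uint32),
        ("cFileName", ctypes.c_uint16 * MAX_PATH),
        ("cAlternateFileName", ctypes.c_uint16 * 14),
    ]


if __name__ == "__main__":
    print("total bytes:", ctypes.sizeof(WIN32_FIND_DATAW))
    print("cFileName bytes:", WIN32_FIND_DATAW.cFileName.size)
    print("cAlternateFileName bytes:", WIN32_FIND_DATAW.cAlternateFileName.size)
```

The two name fields should come out at 520 and 28 bytes, matching MAX_PATH * 2 and 14 * 2 in the VB6 Type.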


Anyway, I hope this helps a little...