Let's assume you have a bunch of files (in a directory tree) on a Linux/Unix system and you'd like to copy them over to a Windows NTFS filesystem. The latter allows far fewer characters in filenames (and directory names) than Linux/Unix. The following code goes through the entire tree (starting with the current working directory) and removes all invalid characters from filenames. Because of the -f test, only files are renamed; directories are left untouched. Note that it relies on a few non-standard extensions (e.g. not all find implementations have a -print0 option).
find . -depth -mindepth 1 -print0 | while IFS="" read -r -d "" entry; do if [ -f "${entry}" ]; then b="$(basename "${entry}")"; n="$(printf '%s' "${b}" | tr -d '\001-\037/\\:*?"<>|')"; if [ "${b}" != "${n}" ]; then d="$(dirname "${entry}")"; [ -f "${d}/${b}" ] && mv "${entry}" "${d}/${n}"; fi; fi; done
P.S.: I used David's writeup on how to process directory entries correctly and the Wikipedia article on NTFS for the list of valid characters.
P.S.2: Beware that simply removing invalid characters might result in data loss, since several filenames can be converted to the same string this way. E.g. both "my test?file.txt" and "my test:file.txt" will be converted to "my testfile.txt", and only one will be kept. If you really need to cover such special cases, you could replace each invalid character with a number (instead of simply removing it) and increment this number after each processed file (i.e. directory entry). This makes collisions far less likely; checking that the target name is still free before renaming rules them out.
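The numbering idea from P.S.2 could look roughly like the sketch below. It is only an illustration under a few assumptions: the function names numbered_name and rename_entry and the counter variable are made up here, the script needs bash, and, like the one-liner above, it only touches files. Every invalid character is first mapped to "/" (which can never be part of a basename) and then replaced by the current number; if the resulting name is already taken, the number is bumped until a free name is found.

```bash
#!/usr/bin/env bash
# Sketch only: replace invalid characters with a running number instead of
# deleting them. numbered_name, rename_entry and counter are illustrative names.

counter=0

# numbered_name NAME NUMBER: print NAME with each NTFS-invalid character
# replaced by NUMBER.
numbered_name() {
    local name="$1" number="$2" marked
    # Map every invalid character to "/", which cannot occur in a basename ...
    marked="$(printf '%s' "${name}" | tr '\001-\037/\\:*?"<>|' '[/*]')"
    # ... then replace each "/" marker with the number.
    printf '%s' "${marked//\//${number}}"
}

# rename_entry PATH: rename PATH if its basename contains invalid characters,
# bumping the counter until the target name is not in use.
rename_entry() {
    local entry="$1" d b n
    [ -f "${entry}" ] || return 0
    d="$(dirname "${entry}")"
    b="$(basename "${entry}")"
    n="$(numbered_name "${b}" "${counter}")"
    [ "${b}" = "${n}" ] && return 0
    while [ -e "${d}/${n}" ]; do
        counter=$((counter + 1))
        n="$(numbered_name "${b}" "${counter}")"
    done
    mv -- "${entry}" "${d}/${n}"
    counter=$((counter + 1))
}

# Possible use with the same find invocation as above:
#   find . -depth -mindepth 1 -print0 |
#       while IFS="" read -r -d "" entry; do rename_entry "${entry}"; done
```

With the counter starting at 0, "my test?file.txt" and "my test:file.txt" would end up under two different names (one with 0 and one with 1 in place of the invalid character), so neither overwrites the other.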
Comments
A couple of suggestions
1. Please format the above command with newlines so that it is more readable.
2. Please include solutions for removal of initial and trailing spaces as well as trailing dots (".").
3. '-d' option for 'read' is a bashism - only '-r' is defined by POSIX.
4. Using 'mv -i' will prompt before overwriting existing file or directory.
Cheers,
rjc
Re: A couple of suggestions
Formatted version (the added sed search & replace should take care of whitespace at the start and end of filenames, and of trailing dots as well):
find . -depth -mindepth 1 -print0 | while IFS="" read -r -d "" entry; do
    if [ -f "${entry}" ]; then
        b="$(basename "${entry}")"
        # drop invalid characters, then leading whitespace and trailing whitespace/dots
        n="$(printf '%s' "${b}" | tr -d '\001-\037/\\:*?"<>|' | sed -e 's/^[ \t]\+//g' -e 's/[ \t.]\+$//g')"
        if [ "${b}" != "${n}" ]; then
            d="$(dirname "${entry}")"
            [ -f "${d}/${b}" ] && mv "${entry}" "${d}/${n}"
        fi
    fi
done
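To check what the tr and sed pipeline does to a single name without touching any files, it can be wrapped in a small function. This is just a sketch: clean_name is a made-up name, printf is used instead of echo so that names such as "-n" pass through unchanged, and the sed expressions use the POSIX [[:blank:]] class instead of the GNU-specific \t and \+ (an assumption about portability, not something the loop above requires).

```bash
#!/usr/bin/env bash
# clean_name NAME: print NAME roughly the way the loop above would rewrite it.
# (Illustrative helper, not part of the original script.)
clean_name() {
    printf '%s' "$1" \
        | tr -d '\001-\037/\\:*?"<>|' \
        | sed -e 's/^[[:blank:]]*//' -e 's/[[:blank:].]*$//'
}

printf '[%s]\n' "$(clean_name '  my test?file.txt. ')"   # prints: [my testfile.txt]
printf '[%s]\n' "$(clean_name '?..')"                    # prints: []
```

The second call shows an edge case: a name made up only of removable characters becomes empty, which the loop above does not guard against.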
mv -i
It's not a rename confirmation, it's an overwrite confirmation.
Re: mv -i
Thanks for this, with one modification