
What's in a File Name?

The Problem

You’ve just finished the breathtaking introductory slide of your quarterly sales performance presentation. This is going to get your name out there - better save early and often. You don’t keep the most organized files; you’ve lost time before due to badly named documents. Wanting to be explicitly clear, you christen the slideshow Jamie's Quarterly Sales Performance w/ Graphs (you’re going to kill it with those doughnut charts).

You want to get feedback before your presentation, so you share it with some colleagues via a combination of network drives, online file sharing and, of course, email attachments. The reviews start to come in and they’re… all related to difficulties in accessing the file. What happened?

Concepts

File Extensions

A file extension is a convention of appending a short identifier for the type of the file after the name, separated by a dot. For example, in dog.jpg, jpg would be the file extension. This is used by many programs to guess the type/content of the file without having to read anything inside the actual file. It can also be useful for things like searching for specific types of files. Some programs require that a file extension map to the actual type of the file, but, in general, the extensions can be applied (or removed) arbitrarily (though you should probably have a good reason for doing so).
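As a rough illustration of how a program might guess a file's type from its name alone, here is a minimal Python sketch. The helper name `extension_of` is made up for this example, and lowercasing the result is an assumption about how you might want to compare extensions.

```python
from pathlib import PurePosixPath


def extension_of(filename: str) -> str:
    """Return the text after the last dot, lowercased, or "" if there is none."""
    # .suffix includes the leading dot (".jpg"), so strip it off.
    return PurePosixPath(filename).suffix.lstrip(".").lower()


print(extension_of("dog.jpg"))         # jpg
print(extension_of("archive.tar.gz"))  # gz
print(extension_of("README"))          # prints an empty line
```

Note that nothing here looks inside the file. Renaming dog.jpg to dog.txt changes the answer without changing the content.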

Operating Systems and File Systems

Your Operating System (OS), such as Windows, Mac, or Linux, may have certain rules or expectations about what constitutes a valid file name. Additionally, your file system (such as NTFS on Windows, HFS+ or APFS on Mac, or any number of file systems on Linux) may have additional restrictions or complications. There’s a good chance that your laptop hard drive, your thumb drive, and your cloud server are all using different file systems.

Case Sensitivity

Many file systems, including those typically used on Linux, treat upper and lowercase letters as separate entities entirely. A file test.txt is different from Test.txt, and they both can (but should not; see below) live in the same directory.

Windows, however, is “case insensitive but case preserving,” and so is macOS in its default configuration. This means that the system will save the case of the letters you use in a file name, but it will recognize that file when the same characters are given in any case. On these systems, you cannot have the two file names mentioned above in the same directory.
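If you want to check a batch of names before copying them to a case-insensitive system, something like the following Python sketch could help. The function name `case_collisions` is hypothetical, and `str.casefold()` is only an approximation of how a real file system compares names.

```python
from collections import defaultdict


def case_collisions(names):
    """Group names that differ only by letter case."""
    groups = defaultdict(list)
    for name in names:
        # casefold() is a stand-in for the file system's own comparison rules.
        groups[name.casefold()].append(name)
    return [group for group in groups.values() if len(group) > 1]


print(case_collisions(["test.txt", "Test.txt", "notes.md"]))
# [['test.txt', 'Test.txt']]
```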

Quoting, Escaping, Encoding

Sometimes a name needs to be displayed in a type of document or data structure such that the name itself has characters that are in some way incompatible with or confusing in the context of the document. In these situations, one of the following strategies will generally be employed:

Quoting

Place the name between quotes. This is often used in a command-line environment.

"my filename with spaces"

Escaping

Prefix the troublesome character with another character to signify a deviation from its usual use. This is often done with a \ character.

Use a quote inside of a quoted string:

"a filename with \"spaces\" and a quote"
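Quoting and escaping are usually best left to a library rather than done by hand. As a small sketch, Python's standard library offers `shlex.quote` for POSIX-style shells and `json.dumps` for JSON strings. The file names are just examples, and other shells or formats have their own rules.

```python
import json
import shlex

# Quoting: wrap the name so a POSIX shell treats it as one argument.
name = "my filename with spaces"
print(shlex.quote(name))  # 'my filename with spaces'

# Escaping: JSON puts a backslash before each embedded double quote.
tricky = 'a filename with "spaces" and a quote'
print(json.dumps(tricky))  # "a filename with \"spaces\" and a quote"
```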

Encoding

This is often used in URLs. Characters that cannot be directly included are translated to an alternative representation. URLs, for example, cannot contain spaces, so each space must be replaced with %20 (or, in query strings, with +).
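To see what this looks like in practice, here is a short Python sketch using `urllib.parse`. The file name is made up for the example.

```python
from urllib.parse import quote, quote_plus, unquote

name = "Quarterly Sales & Graphs.pptx"  # hypothetical file name

# quote() percent-encodes, which suits the path part of a URL.
print(quote(name))       # Quarterly%20Sales%20%26%20Graphs.pptx

# quote_plus() uses + for spaces, as in form data and query strings.
print(quote_plus(name))  # Quarterly+Sales+%26+Graphs.pptx

# Decoding gets the original name back.
print(unquote(quote(name)) == name)  # True
```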

Always Safe

The following should never cause problems and can be used without issue.

  • Uppercase and lowercase letters A-Z and a-z
  • Numbers 0-9
  • Underscores _
  • Dashes -
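The list above translates fairly directly into a regular expression. The following Python sketch checks the part of a name before the extension and, if needed, swaps everything else for underscores. The function names are hypothetical, and replacing with an underscore is just one possible choice.

```python
import re

# Letters, digits, underscores, and dashes only.
SAFE_NAME = re.compile(r"[A-Za-z0-9_-]+")


def is_always_safe(stem: str) -> bool:
    """True if the name (without its extension) uses only the safe characters."""
    return SAFE_NAME.fullmatch(stem) is not None


def make_safe(stem: str) -> str:
    """Replace each run of other characters with a single underscore."""
    return re.sub(r"[^A-Za-z0-9_-]+", "_", stem).strip("_")


title = "Jamie's Quarterly Sales Performance w/ Graphs"
print(is_always_safe(title))  # False
print(make_safe(title))       # Jamie_s_Quarterly_Sales_Performance_w_Graphs
```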

Generally Safe

These shouldn’t cause problems but have more potential to do so than the above.

Spaces

Spaces are generally safe, though they can make some things cumbersome, such as dealing with a file on a command line or in a URL.

.

Most files have a single dot, but more shouldn’t generally be a problem. There was a time when some systems would not allow a . in the file name except to separate the name from the extension (such as txt, jpg, or doc). Some very old software may make incorrect assumptions about the content of the file, but if you’re in this camp, you’re probably already using very conservative filenames.

It depends

Depending on which systems you use, these may be just fine or something to avoid.

Non-English Characters and Emoji

Unicode support has come a long way, and Macs and current Windows computers should not have problems with filenames such as:

  • かわいい犬.jpg
  • Blue Öyster Cult.mp3
  • 🐈.doc

Older Windows versions, some portable drives, and some Linux GUIs and file systems (Linux supports a great many of them) may have issues.

Best to Avoid

Some of these may be impossible on your system. Some may be possible but difficult to create. It’s best to steer clear of these altogether.

/ and \

Windows officially separates directories with the backslash \ character, while most others use a forward slash /. However, Windows will generally understand a / to mean the same thing when specifying a file path. Because either character may be read as a directory separator, neither belongs in a file name.

:

Old Macs (before OS X) separated directories with :. Current ones will allow : as part of a file name in some contexts, but will treat it like a / in others.

Windows uses a colon to denote a drive letter, as in C:.

Newlines

You generally have to go out of your way to include a newline in a file name. Don’t.

All kinds of quotes

While these generally won’t cause problems, some software may become confused when trying to quote (see above) the filename. Additionally, the single ' and double " quotes may be unintentionally translated by email clients, word processors, messaging apps, etc., into the very similar but different curly quotes (‘ ’ and “ ”).

< >

Not only are < and > used in most command line environments to redirect streams, they can also cause all kinds of headaches when they need to be included in XML or HTML documents.

&

Can cause confusion on the command line, in URLs, and in XML documents.
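To get a feel for the HTML and XML headaches that <, >, and & bring, here is a small Python sketch using `html.escape`. The file name is invented for the example.

```python
from html import escape, unescape

name = "Q1 <draft> & notes.txt"  # hypothetical file name

# Each special character must become an entity before the name
# can appear safely inside an HTML or XML document.
print(escape(name))  # Q1 &lt;draft&gt; &amp; notes.txt

# Anything reading the document has to reverse the process.
print(unescape(escape(name)) == name)  # True
```

Any tool in the chain that forgets one of these steps will display or store the wrong name.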

@

Do you want your file to show up as an email link? Because that’s how you get your file to show up as an email link.

* and ?

These are used for globbing patterns when searching for filenames.
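Here is a short Python sketch of the kind of surprise this can cause. The file names are made up. `fnmatch` applies shell-style patterns to a list of names, and `glob.escape` neutralizes the special characters.

```python
import fnmatch
import glob

names = ["what?.txt", "whats.txt", "what.txt"]

# Used as a pattern, the ? matches any single character,
# so searching for the literal name finds an extra file.
print(fnmatch.filter(names, "what?.txt"))
# ['what?.txt', 'whats.txt']

# glob.escape() wraps the special characters so they match literally.
print(glob.escape("what?.txt"))
# what[?].txt
print(fnmatch.filter(names, glob.escape("what?.txt")))
# ['what?.txt']
```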

|

The pipe | character is used to pipe information from one command into another.
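Putting this section together, a quick check for these characters might look like the Python sketch below. The `AVOID` set and the `risky_characters` name are assumptions made for this example. The set covers only the straight quote characters, not their curly cousins.

```python
# Characters from the "Best to Avoid" section: slashes, colon, newline,
# straight quotes, angle brackets, ampersand, at sign, wildcards, pipe.
AVOID = set('/\\:\n"\'<>&@*?|')


def risky_characters(filename: str) -> list:
    """Return each risky character found, in order of first appearance."""
    found = []
    for ch in filename:
        if ch in AVOID and ch not in found:
            found.append(ch)
    return found


print(risky_characters("Jamie's Quarterly Sales Performance w/ Graphs"))
# ["'", '/']
print(risky_characters("sales_q3-final.pptx"))
# []
```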

Conclusion

Keeping things simple while naming can prevent trouble for you, and anyone who uses your files, down the road.
