Character Gremlins

By Ryan Veazey
Aug 21, 2017

The Problem

You’re testing a new service that has to run over HTTPS, so you need to get an SSL certificate onto the test box. Your server runs Linux, but you’re a Windows guy, so you fire up PuTTY, open the certificate in Notepad, and copy and paste its text into the SSH session. The certificate isn’t valid. You ask your admin, who simply copies the file over with scp, and now it works. You cat the contents, and they look the same as what you see in Notepad. What did you do wrong?

What happened

The most likely issue involving copy/paste and going between Windows and other operating systems is line endings. It may be obvious that text contains characters like ‘a’ or ‘z’, but it can also contain characters like a space, a tab, or what we call a newline, which is a character that means the line of text is done and a new one should be started.

On Windows, a newline is actually represented by two characters, a carriage return (CR) followed by a line feed (LF). Virtually all other modern operating systems, including Linux and macOS, use only the LF character.

So the problem is that there is an invisible piece of extra data in our certificate. The opposite situation can also happen: LF-only line endings on Windows computers not being properly understood. Luckily, the solution is pretty simple.

Confirming the issue
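
One way to confirm the issue is to look at the raw bytes rather than the rendered text. The following is a minimal Python sketch, assuming the file is small enough to read into memory; the function name count_line_endings and the sample data are hypothetical, chosen only for illustration.

```python
def count_line_endings(data: bytes) -> dict:
    """Count CRLF, bare LF, and bare CR line endings in raw bytes."""
    crlf = data.count(b"\r\n")
    # Every CRLF pair also contains one CR and one LF, so subtract the pairs
    # to get the number of bare (unpaired) characters.
    lf = data.count(b"\n") - crlf
    cr = data.count(b"\r") - crlf
    return {"crlf": crlf, "lf": lf, "cr": cr}


# Hypothetical sample: two Windows-style lines followed by one Unix-style line.
sample = b"line one\r\nline two\r\nline three\n"
print(count_line_endings(sample))  # {'crlf': 2, 'lf': 1, 'cr': 0}
```

For a real file, read it in binary mode (open(path, "rb")) so that Python does not translate the line endings before you count them. A non-zero "crlf" count on a Linux box is a strong hint that Windows line endings came along for the ride.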

Fixing the issue

The simplest solution is to use a small program called dos2unix (or its inverse, unix2dos) to convert a file from one line-ending format to the other. It can generally be installed through package managers such as brew, yum, or apt under the name dos2unix. Downloadable binaries are also available from the dos2unix homepage.
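
If installing a tool isn’t an option, the core of the conversion can be approximated in a few lines of Python. This is only a sketch of the idea, not a reimplementation of dos2unix, which handles more cases; the function names dos_to_unix and unix_to_dos are hypothetical.

```python
def dos_to_unix(data: bytes) -> bytes:
    """Convert CRLF line endings to LF, leaving everything else untouched."""
    return data.replace(b"\r\n", b"\n")


def unix_to_dos(data: bytes) -> bytes:
    """Convert LF line endings to CRLF."""
    # Normalize to LF first so existing CRLF pairs do not become CR CR LF.
    return dos_to_unix(data).replace(b"\n", b"\r\n")


windows_text = b"first line\r\nsecond line\r\n"
print(dos_to_unix(windows_text))  # b'first line\nsecond line\n'
```

As with the check above, work on bytes read and written in binary mode; opening the file in text mode may let Python translate the line endings on its own and hide the problem.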

Other Similar Issues

Encoding

We often refer to these types of files as plain text files, but it’s a little more complicated than that. Textual files still have some sort of encoding, such as ASCII or UTF-8. Each character is represented by a sequence of bytes, which may differ between encodings. If an application is expecting a specific type of encoding and gets a different one, this can cause all kinds of problems. Using the file program can help show the encoding.
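
To make the point about bytes concrete, here is a small Python sketch showing the same text encoded two different ways, and what happens when the bytes are read back with the wrong encoding. The word used is an arbitrary example.

```python
text = "café"

utf8_bytes = text.encode("utf-8")      # the "é" becomes two bytes: 0xC3 0xA9
latin1_bytes = text.encode("latin-1")  # the "é" becomes one byte: 0xE9

print(len(utf8_bytes), len(latin1_bytes))  # 5 4

# Reading UTF-8 bytes as Latin-1 silently produces the wrong characters.
print(utf8_bytes.decode("latin-1"))  # cafÃ©

# Reading Latin-1 bytes as UTF-8 fails outright, because 0xE9 on its own
# is not a valid UTF-8 sequence.
try:
    latin1_bytes.decode("utf-8")
except UnicodeDecodeError as err:
    print("decode failed:", err.reason)
```

The first kind of failure is the nastier one: nothing crashes, but the text that arrives is not the text that was sent.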

Quotes

Some programs replace straight double or single quotes with curly (“smart”) quotes. If you’ve copied and pasted or edited your text in an application like Microsoft Word, Pages, or some email clients, you may end up with quotes that the intended program doesn’t understand.
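
A quick way to undo this, sketched in Python, is to map the common curly quote characters back to their straight equivalents. The table below covers only the four most common characters, and the name straighten_quotes is hypothetical; other typographic substitutions would need their own entries.

```python
# Map curly quote code points to their plain ASCII equivalents.
CURLY_TO_STRAIGHT = str.maketrans({
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark (also used as an apostrophe)
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
})


def straighten_quotes(text: str) -> str:
    """Replace curly single and double quotes with straight ones."""
    return text.translate(CURLY_TO_STRAIGHT)


print(straighten_quotes("echo \u201chello\u201d"))  # echo "hello"
```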

Prevention

One benefit of using software like Jungle Disk to share files is that you can ensure the exact file makes its way to the destination without any alterations. Other options include copying files via SCP or via source control systems like Git, which have settings for translating line endings to the correct format for your operating system. If you’re using FTP, make sure you understand when to use binary mode (copy the file literally) versus ASCII mode (use the native line endings of the OS you’re currently on).

Line-ending and encoding issues are easy to diagnose and fix as long as you remember they exist. The next time everything looks correct but a file still isn’t being understood or parsed properly, be sure to check the line endings.