Industry News by Ryan Veazey Aug 21, 2017

Character Gremlins - Line Endings for Dummies

The problem

You’re testing out a new service. It needs to run over ‘https’, so you need to get an SSL certificate onto the testing box. Your server is running Linux, but you’re a Windows guy, so you fire up PuTTY, open the certificate in Notepad, and copy and paste the text of the certificate into the ssh session. The certificate isn’t valid. You ask your admin, who just copies the file over with scp, and now it works. You cat the contents and they look the same as what you’re viewing in Notepad. What did you do wrong?

What happened

The most likely culprit when copying and pasting between Windows and other operating systems is line endings. It may be obvious that text contains characters like ‘a’ or ‘z’, but it can also contain characters like a space, a tab, or what we call a newline: a character that means the current line of text is done and a new one should start. On Windows, a newline is actually represented by two characters, a carriage return (CR) followed by a line feed (LF). Virtually all other modern operating systems, including Linux and macOS, use only the LF character. So the problem is that there is an invisible piece of extra data, a CR at the end of each line, in our certificate. The opposite situation can also happen: LF-only line endings not being properly understood on Windows computers. Luckily, the solution is pretty simple.

Confirming the issue

The file program can tell you what a file contains, including its encoding and line endings; for a Windows-style file it typically reports “with CRLF line terminators”. The hexdump program can show the content of the file in hexadecimal format. Using the -C flag shows the hex alongside an ASCII representation, in which newlines and other non-printable characters appear as a period (.). In the hex column, CR is 0d and LF is 0a, so a Windows line ending shows up as the pair 0d 0a. Text editors such as vim, emacs, Visual Studio Code, Atom or Notepad++ also have ways of displaying unprintable characters.
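If file or hexdump isn’t available, the same check can be approximated in a few lines of Python. This is only a sketch: the function name count_line_endings and the command-line usage are made up for this example and are not part of any tool mentioned above. The one important detail is reading the file in binary mode, because text mode may silently translate line endings before you get to see them.

```python
import sys


def count_line_endings(data: bytes) -> dict:
    """Count CRLF, bare LF and bare CR line endings in raw bytes."""
    crlf = data.count(b"\r\n")
    # Every CRLF pair also contains one LF and one CR, so subtract the pairs.
    lf = data.count(b"\n") - crlf
    cr = data.count(b"\r") - crlf
    return {"crlf": crlf, "lf": lf, "cr": cr}


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python count_endings.py some_file.pem
    # "rb" (binary mode) keeps Python from translating the line endings.
    with open(sys.argv[1], "rb") as handle:
        print(count_line_endings(handle.read()))
```

A certificate pasted from Notepad would be expected to show a non-zero crlf count, while the copy transferred with scp from a Linux source should show only lf.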
Fixing the issue

The simplest solution is a small program called dos2unix (and its inverse, unix2dos), which converts a file from one line-ending format to the other. It can generally be installed through package managers such as brew, yum or apt under the name dos2unix. The dos2unix homepage also offers downloadable binaries.

Other Similar Issues

Encoding

We often refer to these types of files as plain text files, but it’s a little more complicated than that. Textual files still have some sort of encoding, such as ASCII or UTF-8. Each character is represented by a sequence of bytes, and that sequence may differ between encodings. If an application expects one encoding and gets a different one, this can cause all kinds of problems. The file program can help here too, since it reports the encoding it detects.

Quotes

Some programs replace regular double or single quotes with angled (“smart”) quotes. If you’ve copied and pasted or edited your text in an application like Microsoft Word, Pages or some email clients, you may have quotes that the intended program doesn’t understand.

Prevention

One benefit of using software like Jungle Disk to share files is that you can ensure the exact file makes its way to the destination without any alterations. Other options include copying files via SCP, or via source control systems like Git, which have settings for translating line endings to the correct format for your operating system. If you’re using FTP, you’ll need to understand when to use binary mode (copy the file literally) versus ASCII mode (convert line endings to the native format of the system receiving the file).

Line-ending and encoding issues are easy to diagnose and fix, as long as you remember they exist. The next time everything looks correct but a file still isn’t being understood or parsed properly, be sure to check the line endings.
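For cases where installing dos2unix isn’t an option, the fixes discussed in this article can be sketched in Python. Treat this as an illustration under stated assumptions rather than a replacement for the real tools: the helper names (to_unix, to_dos, is_ascii, straighten_quotes) are invented for this example, only CRLF and LF endings are handled, and only the four most common angled quotes are mapped.

```python
# Mapping of common angled ("smart") quotes to their plain equivalents.
SMART_QUOTES = {
    "\u2018": "'",  # left single quote
    "\u2019": "'",  # right single quote
    "\u201c": '"',  # left double quote
    "\u201d": '"',  # right double quote
}


def to_unix(data: bytes) -> bytes:
    """Convert CRLF line endings to LF, in the spirit of dos2unix."""
    return data.replace(b"\r\n", b"\n")


def to_dos(data: bytes) -> bytes:
    """Convert LF line endings to CRLF, in the spirit of unix2dos."""
    # Normalize first so existing CRLF pairs do not become CR CR LF.
    return to_unix(data).replace(b"\n", b"\r\n")


def is_ascii(data: bytes) -> bool:
    """Return True if every byte decodes as plain ASCII."""
    try:
        data.decode("ascii")
        return True
    except UnicodeDecodeError:
        return False


def straighten_quotes(text: str) -> str:
    """Replace angled quotes with regular single and double quotes."""
    return text.translate(str.maketrans(SMART_QUOTES))
```

As with the detection step, files should be opened in binary mode ("rb" and "wb") when applying to_unix or to_dos, so that Python does not translate the line endings on its own.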