Line Endings

2024-01-27

Have you ever opened a text file that you thought would look like this, with each line on a new line...

Mary had a little lamb,
Its fleece was white as snow (or black as coal).
And everywhere that Mary went,
The lamb was sure to go.
He followed her to school one day,
That was against the rule.
It made the children laugh and play
To see a lamb at school.

        

And instead found that it was either all on a single line or double spaced? If you have, then you've encountered the difference between Unix/Linux line endings and Windows line endings. Over the years, this has caused a lot of annoyance among a great many people. Most folks I've discussed line endings with are in the Unix/Linux camp, but I'll tell you why I prefer the Windows way.

We all know that writing is composed of characters such as letters, numbers, punctuation, and spaces. But how does a computer know when to put text on a new line? This is where line endings come in. Plain text files mark the end of a line with one or both of two "control characters": CR (Carriage Return) and LF (Line Feed).
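Both are ordinary single-byte ASCII codes. As a small illustration (Python is my choice here, and the helper name describe is made up for this sketch), this prints their numeric values:

```python
# CR and LF are ASCII control characters, usually written as the
# escape sequences "\r" and "\n" in source code.
CR = "\r"  # Carriage Return
LF = "\n"  # Line Feed

def describe(ch):
    """Return the decimal and hexadecimal code of a single character."""
    return f"{ord(ch)} (0x{ord(ch):02X})"

print("CR:", describe(CR))  # CR: 13 (0x0D)
print("LF:", describe(LF))  # LF: 10 (0x0A)
```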

In the early days of computing, writing text on a computer was made analogous to typing text on a typewriter. In fact, the early electronic typewriters would take text files directly as their input. This is why Carriage Return and Line Feed are part of the ASCII specification. Both are analogous to functions of a physical typewriter.

A Carriage Return on a typewriter sends the carriage (the little hand what goes 'whack!') back to the far left-hand side of the paper without advancing the paper up. This is useful if you want to type over something you've already typed (think strikeout).

A Line Feed advances the paper up, but leaves the carriage in its horizontal location. This is useful if you want to type vertically or stagger text, like this:

Mary
    Had
       A
        Little 
            Lamb

Both have their uses individually. When used in conjunction, you'll advance the paper and also return the carriage to the far left side, thus achieving what we commonly think of as a "new line".
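The typewriter behavior described above can be sketched as a toy simulation. This is only an illustration under my own assumptions: the function name typewriter is made up, and typing over an existing character replaces it, where a real typewriter would print both on top of each other.

```python
def typewriter(text):
    """Render text like a typewriter, treating CR and LF as separate actions.

    CR moves the print position back to column 0 without changing the row.
    LF moves down one row without changing the column.
    """
    rows = [[]]      # the "paper": a list of rows, each a list of characters
    row, col = 0, 0  # current print position
    for ch in text:
        if ch == "\r":
            col = 0
        elif ch == "\n":
            row += 1
            if row == len(rows):
                rows.append([])
        else:
            line = rows[row]
            # Pad with spaces if the print position is past the end of the row.
            while len(line) <= col:
                line.append(" ")
            line[col] = ch  # assumption: overwrite instead of overstrike
            col += 1
    return "\n".join("".join(r).rstrip() for r in rows)

print(typewriter("Mary\nHad\nA"))      # LF only: each word starts where the last ended
print(typewriter("Mary\r\nHad\r\nA"))  # CR and LF together: ordinary new lines
print(typewriter("abc\rX"))            # CR only: types over the start of the same row
```

With LF alone the words come out staggered, as in the example above. With CR and LF together each word starts at the left margin of its own row.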

Unix and Linux systems use just the Line Feed character for newlines, whereas Windows uses the Carriage Return and Line Feed characters together, in that order.
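If you want to check which convention a particular file uses, one rough approach is to count the sequences in its raw bytes. The sketch below is a minimal example, not a robust detector; the function name count_line_endings and the sample data are made up for illustration.

```python
def count_line_endings(data: bytes) -> dict:
    """Count CRLF pairs, lone LFs, and lone CRs in raw bytes."""
    crlf = data.count(b"\r\n")
    lf = data.count(b"\n") - crlf  # LFs that are not part of a CRLF pair
    cr = data.count(b"\r") - crlf  # CRs that are not part of a CRLF pair
    return {"CRLF": crlf, "LF": lf, "CR": cr}

lf_sample = b"Mary had a little lamb,\nIts fleece was white as snow.\n"
crlf_sample = b"Mary had a little lamb,\r\nIts fleece was white as snow.\r\n"

print(count_line_endings(lf_sample))    # {'CRLF': 0, 'LF': 2, 'CR': 0}
print(count_line_endings(crlf_sample))  # {'CRLF': 2, 'LF': 0, 'CR': 0}
```

To run this against a real file, read it in binary mode, for example open(path, "rb").read(), because Python's default text mode translates line endings as it reads.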

Because these two characters mean different things, and because the point of including them in ASCII was to emulate physical typewriters (and also to support teletypes), I believe that Windows got it right and Unix/Linux got this one wrong.