@louis-langholtz
Last active October 24, 2023 02:40
C++: More Reasons To Avoid std::endl


In the realm of death by a thousand cuts, please stop using std::endl and remove it from code. Per cppreference.com, std::endl "inserts a newline character into the output sequence... and flushes it". Okay, so why exactly shouldn't it be used???

The Issues

Numerous articles have been written about endl that echo the sentiment against using it. They highlight the following:

  1. Flushing the stream slows performance and is usually neither intended nor necessary.
  2. Using the newline character, \n, is just as cross-platform compatible for stream output as using endl, but it doesn't force a flush of the stream.
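
To make the flushing difference concrete, here is a minimal sketch that counts how often a stream's buffer is asked to synchronize. CountingBuf, flushesWithEndl, and flushesWithNewline are hypothetical names made up for this illustration, and the buffer is an in-memory one rather than a real file or terminal, so it shows how often a flush is requested and not what each flush costs:

```cpp
#include <cassert>
#include <ostream>
#include <sstream>

// Hypothetical helper: an in-memory buffer that counts sync requests.
// std::ostream::flush() calls pubsync() on the buffer, which calls sync().
class CountingBuf : public std::stringbuf
{
public:
    int syncs = 0;

protected:
    int sync() override
    {
        ++syncs;
        return 0; // report success
    }
};

// Writes `lines` lines ending in std::endl; returns the number of flushes seen.
int flushesWithEndl(int lines)
{
    CountingBuf buf;
    std::ostream os(&buf);
    for (int i = 0; i < lines; ++i) {
        os << "Hello world!" << std::endl;
    }
    return buf.syncs;
}

// Writes `lines` lines ending in '\n'; returns the number of flushes seen.
int flushesWithNewline(int lines)
{
    CountingBuf buf;
    std::ostream os(&buf);
    for (int i = 0; i < lines; ++i) {
        os << "Hello world!\n";
    }
    return buf.syncs;
}

// Usage example: one flush per line with endl, none with '\n'.
void demoFlushCounts()
{
    assert(flushesWithEndl(100) == 100);
    assert(flushesWithNewline(100) == 0);
}
```

With a real file or pipe behind the stream, each of those flush requests may turn into a system call, which is where the slowdown tends to come from.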

Admittedly, for environments where speed isn't a concern, these points may not seem like that big of a deal. There are additional, less-mentioned concerns that I'd like to point out, however.

Before flushing the stream, and depending on the C++ standard version, library implementation, and compiler, endl also:

  1. Executes code to possibly widen the newline (using std::basic_ios<CharT,Traits>::widen). This code:
    1. Gets the current locale associated with the stream (std::ios_base::getloc).
    2. Uses this locale's "facet" (via a call to std::use_facet) for the character type of the stream to do the possible widening. For details, see the source code for use_facet in the LLVM or Microsoft C++ standard libraries. This in turn:
      1. Checks that std::has_facet<Facet>(loc) is true for the locale and the facet identified for the stream.
      2. Throws std::bad_cast if this check is false.
      3. Returns the available identified facet otherwise.
  2. Executes code to output the possibly widened newline to the output stream (through a call to std::basic_ostream<CharT,Traits>::put).
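
The steps above can be sketched as a hand-written manipulator. This is only an approximation of what a standard library does, not the source of any particular implementation, and verboseEndl and demoVerboseEndl are hypothetical names invented for this example:

```cpp
#include <cassert>
#include <locale>
#include <ostream>
#include <sstream>

// Hypothetical stand-in for std::endl with each step spelled out.
template <class CharT, class Traits>
std::basic_ostream<CharT, Traits>&
verboseEndl(std::basic_ostream<CharT, Traits>& os)
{
    // Get the stream's locale and look up its ctype facet.
    // std::use_facet throws std::bad_cast if the facet is missing.
    const auto& facet = std::use_facet<std::ctype<CharT>>(os.getloc());

    // Possibly widen the newline to the stream's character type.
    const CharT newline = facet.widen('\n');

    // Output the (possibly widened) newline, then force the flush.
    os.put(newline);
    os.flush();
    return os;
}

// Usage example: the sketch produces the same characters as std::endl.
void demoVerboseEndl()
{
    std::ostringstream mine;
    std::ostringstream theirs;
    mine << "Hello world!" << verboseEndl;
    theirs << "Hello world!" << std::endl;
    assert(mine.str() == theirs.str());
}
```

Everything before the flush is work that writing "\n" inside a string literal avoids entirely.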

So using endl comes with more baggage than just a forced flush of the stream, and most programmers using endl that I've spoken with seem unaware of any of it.
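
The std::bad_cast path in the facet lookup is easy to overlook, so here is a small sketch of it in isolation. DemoFacet is a hypothetical facet defined only for this example (it stands in for a facet that a locale happens not to contain), and lookupThrowsBadCast is likewise an invented helper:

```cpp
#include <cassert>
#include <locale>
#include <typeinfo>

// Hypothetical facet used only for this illustration.
struct DemoFacet : std::locale::facet
{
    static std::locale::id id; // every facet type needs its own id
};
std::locale::id DemoFacet::id;

// Returns true if looking up DemoFacet in `loc` throws std::bad_cast.
bool lookupThrowsBadCast(const std::locale& loc)
{
    try {
        (void)std::use_facet<DemoFacet>(loc);
    } catch (const std::bad_cast&) {
        return true;
    }
    return false;
}

// Usage example: the lookup fails until the facet is installed.
void demoFacetLookup()
{
    const std::locale plain = std::locale::classic();
    assert(!std::has_facet<DemoFacet>(plain));
    assert(lookupThrowsBadCast(plain));

    // A locale built with the facet installed takes ownership of it.
    const std::locale extended(plain, new DemoFacet);
    assert(std::has_facet<DemoFacet>(extended));
    assert(!lookupThrowsBadCast(extended));
}
```

For the standard character types the ctype facet is normally present, so endl is unlikely to throw here in practice, but the check and the throwing branch still have to be compiled in.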

Not to mention:

  1. endl arguably doesn't express intent as well as alternatives.
  2. endl's use appears to be everywhere. If we don't put more effort into getting rid of it, it's more likely to be perpetuated, especially by newer programmers, into even more code where its use is a big deal.

All this can result in things like unintended behavior, surprising performance loss, and unnecessarily enlarged executables. At the time of writing this, Wikipedia redirected the disambiguation of "death by a thousand cuts" for psychology to Creeping normality. That seems spot on for endl unless we do more to curtail its use.

Alternatives

Instead of using std::endl, just use \n:

#include <ostream>

void someFunction(std::ostream &os)
{
  os << "Hello world!\n";
}

Or, if you really do need to force flush the output stream, be explicit about it and use the std::flush I/O manipulator:

#include <ostream>

void someFunction(std::ostream &os)
{
  os << "Hello world!\n" << std::flush;
}

Example Code

Take the following code for example:

#include <iostream>
#include <ostream>

void doEndl()
{
    std::cout << "Hello world!" << std::endl;
}

void doNewline()
{
    std::cout << "Hello world!\n";
}

Compare the resulting assembly of the two functions for yourself.

Assembly Code When Using std::endl

For the record, here's what Compiler Explorer shows gcc 12.1 generates for the x86-64 target just for the doEndl function:

.LC0:
        .string "Hello world!"
doEndl():
        push    rbx
        mov     edx, 12
        mov     esi, OFFSET FLAT:.LC0
        mov     edi, OFFSET FLAT:std::cout
        call    std::basic_ostream<char, std::char_traits<char> >& std::__ostream_insert<char, std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char const*, long)
        mov     rax, QWORD PTR std::cout[rip]
        mov     rax, QWORD PTR [rax-24]
        mov     rbx, QWORD PTR std::cout[rax+240]
        test    rbx, rbx
        je      .L10
        cmp     BYTE PTR [rbx+56], 0
        je      .L5
        movsx   esi, BYTE PTR [rbx+67]
.L6:
        mov     edi, OFFSET FLAT:std::cout
        call    std::basic_ostream<char, std::char_traits<char> >::put(char)
        pop     rbx
        mov     rdi, rax
        jmp     std::basic_ostream<char, std::char_traits<char> >::flush()
.L5:
        mov     rdi, rbx
        call    std::ctype<char>::_M_widen_init() const
        mov     rax, QWORD PTR [rbx]
        mov     esi, 10
        mov     rax, QWORD PTR [rax+48]
        cmp     rax, OFFSET FLAT:_ZNKSt5ctypeIcE8do_widenEc
        je      .L6
        mov     rdi, rbx
        call    rax
        movsx   esi, al
        jmp     .L6
.L10:
        call    std::__throw_bad_cast()
_GLOBAL__sub_I_doEndl():
        sub     rsp, 8
        mov     edi, OFFSET FLAT:_ZStL8__ioinit
        call    std::ios_base::Init::Init() [complete object constructor]
        mov     edx, OFFSET FLAT:__dso_handle
        mov     esi, OFFSET FLAT:_ZStL8__ioinit
        mov     edi, OFFSET FLAT:_ZNSt8ios_base4InitD1Ev
        add     rsp, 8
        jmp     __cxa_atexit

Assembly Code When Using Newline

Meanwhile, here's what Compiler Explorer shows gcc 12.1 generates for the x86-64 target just for the doNewline function:

.LC0:
        .string "Hello world!\n"
doNewline():
        mov     edx, 13
        mov     esi, OFFSET FLAT:.LC0
        mov     edi, OFFSET FLAT:_ZSt4cout
        jmp     std::basic_ostream<char, std::char_traits<char> >& std::__ostream_insert<char, std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char const*, long)
_GLOBAL__sub_I_doNewline():
        sub     rsp, 8
        mov     edi, OFFSET FLAT:_ZStL8__ioinit
        call    std::ios_base::Init::Init() [complete object constructor]
        mov     edx, OFFSET FLAT:__dso_handle
        mov     esi, OFFSET FLAT:_ZStL8__ioinit
        mov     edi, OFFSET FLAT:_ZNSt8ios_base4InitD1Ev
        add     rsp, 8
        jmp     __cxa_atexit

That's just sixteen lines of assembly for doNewline compared to forty-five for doEndl!

@louis-langholtz
Author

@grantrostig I like the questions you're asking! Biggest reason I like them: C++'s as-if rule.

In short, my interpretation of this rule is that a standards-conforming compiler is at liberty to do what it wants so long as the program's observable behavior is preserved. This really excites me about C++ and gives rise to possibilities like zero-overhead abstractions such as Boost's units library. Applied to your questions, the answer is: it depends.

It depends on things like:

  1. what the compiler chooses to do, and
  2. what optimization level we select.

I know all this, yet I admittedly guessed that compilers like gcc and clang would avoid emitting assembly that repeats M lines of code N times, in favor of calling one function of M lines N times. I was wrong! I suspect their inlining logic works something like: keep inlining until the resulting binary is believed to get slower. Note that I have -O3 enabled. Switching to -O0, I see less inlining and more function calling, which is more like what I was expecting. Note also that this is just looking at the lines of assembly code and using that as a gauge of how large the resulting binary output file would be. That's more reasonable for conventional CPUs than for less conventional ones, which may add a whole extra layer of complexity to the picture.

For more specifics, try with the compiler of your preference, with the compiler options of your choosing, and check the size of the resulting binary executable output file.
