Modern C++ isn’t memory safe, either


Recent updates to the language have resolved some issues, but they have introduced others.

A recurring theme in almost all discussions that compare programming languages, apart from using the wrong tool for the job, adamantly pushing a language objectively/demonstrably inferior at x out of blind loyalty, and bashing languages you've never used or studied simply because you've seen firsthand how well such comments can be received, is something worse: acting on stale information that no longer holds true.

We don’t just have one dog in the race at NeoSmart Technologies; our software is written in a variety of languages, including C/C++, desktop/web C#/ASP.NET, rust, [JS|TypeScript]/HTML/[LESS|CSS], (ba)sh scripting, and more.


So it's always fascinating to watch these debates (sometimes up close and personal, sometimes from afar) and see what arguments linger after the dust has cleared and the troops have returned home for the day.

As previously stated, one of the most crucial things to remember while discussing programming languages is the tool-job link. The languages we code in are different, and in practice there is little overlap between them: no one in their right mind would use a shell script to construct a complex GUI for an enterprise product (or would they?), and JavaScript is hardly the ideal tool for building kernel drivers. Although work on native, dependency-free AOT compilation of .NET Core code has progressed in recent weeks, low-level, low-dependency apps built in C or C++ definitely shouldn't be ditched in favour of .NET Core code anytime soon. But what about rust? While C# may not have made it an explicit objective to replace ancient C and C++ code, rust has.

There's no getting around the fact that rust (the language) exposes significantly more information to help rustc (the compiler) protect developers from hanging themselves with their own rope. Still, there are arguments to be made for the recent advancements in the C++ world that have made manual pointer management a significantly less-necessary evil, combined with toolchain advances that have strengthened static analysis and delivered better performance.

Many rust champions who haven't come from the C++ world are unaware of these enhancements, and it's not unreasonable to expect a rustacean retort along the lines of "and I'll never have to use pointers ever again!" (or something equally overdramatized and untrue) in a "C++ vs rust" debate, to which a C++ developer who has embraced the outpouring of new features in the language and standard library since

While it's true that shared_ptr and co (perhaps most notably unique_ptr<T>, which adds almost rust-like qualities to the language but is unfortunately hidden behind (officially) heap-allocated memory and a pointer indirection) have made C++ a significantly faster, safer, and more productive language to write code in, it's also untrue that C++11 onwards has made it easier to avoid lifetime issues, due to a problem seen far too

The issue is that, while C++11 provided shared_ptr, unique_ptr, and a slew of other features, it also introduced a terribly misunderstood, impossible-to-adequately-vet feature: capture-by-reference for variables used in lambdas, which has resulted in a slew of lifetime issues. At their core, a lambda's reference-captured variables are no different from a function returning a reference to a locally-scoped variable: a clear no-no. But the trouble with lambdas is that they obscure the issue and make it far too easy to capture one or more variables by reference (especially with the nasty, never-should-have-been-adopted capture-all-by-reference [&] capture default, which should be a hard warning by default).

The following code snippet, in which a function-local variable is captured by reference in a lambda, is fundamentally the same as naively returning a locally-scoped variable by reference in "old" C++:

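#include <algorithm>
#include <cctype>
#include <iostream>
#include <iterator>
#include <string>

auto helper1(std::string str) -> auto {
	std::string capital;
	std::transform(str.begin(), str.end(),
		std::back_inserter(capital), ::toupper);
	// capital is a local; the lambda only stores a reference to it
	return [&capital] () {
		std::cout << "HELLO, " << capital << "!" << std::endl;
	};
}

int main(int argc, const char *argv[]) {
	auto callback1 = helper1("Mahmoud");
	// capital was destroyed when helper1 returned: undefined behaviour
	callback1();
}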
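
For contrast, here is a minimal sketch of one way to sidestep the dangling capture, assuming the lambda is allowed to own its own copy of the string. It uses a C++14 init-capture to move the local into the closure. The names helper1_owned and check_helper1_owned are hypothetical, and the lambda returns the greeting instead of printing it purely so the result is easy to check.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <iterator>
#include <string>
#include <utility>

// Hypothetical variant of helper1: the closure owns the string, so
// nothing dangles once this function returns.
auto helper1_owned(std::string str) {
	std::string capital;
	std::transform(str.begin(), str.end(), std::back_inserter(capital),
		[](unsigned char c) { return static_cast<char>(std::toupper(c)); });
	// C++14 init-capture: move the local into the closure object.
	return [capital = std::move(capital)]() {
		return "HELLO, " + capital + "!";
	};
}

// Calls the lambda after helper1_owned has returned.
void check_helper1_owned() {
	auto callback = helper1_owned("Mahmoud");
	assert(callback() == "HELLO, MAHMOUD!");
}
```

The cost is that the closure is no longer a cheap view onto someone else's data, which is exactly the trade-off that tempts people into [&] in the first place.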

It's easy to argue that the invalid memory access above (more technically, a use-after-free) is obvious and a mistake no one could ever fall for twice, but we went into this very contrived example already knowing what to look for. You'd be surprised, though. Earlier "idiomatic" uses of references in C++ (those used by the average programmer rather than by experts or beginners) typically meant either passing arguments to a function by reference or returning by reference from a function, with the former probably outnumbering the latter by at least 500:1.

I'm not going to say it's impossible for an argument passed in by reference to refer to invalid memory, but passing by reference is a generally safe and common practice. Typically, a variable is declared before the function is called, and a reference to it is passed in so that the called function can set or modify its value. This was most commonly used as a workaround for the lack of tuples in C++ (the language), since a function could only return a single result:

bool foo(int input, int &output) {
	bool ok = true;
	int result = input; // placeholder for the computed value
	//do something here that might fail
	if (ok) {
		output = result;
	}
	return ok;
}

In the code above, the reference is used as a more-or-less comparable replacement for C's pointers, preventing the null dereference that could have occurred if int &output had been replaced with int *output instead, exactly as ref in C# would be used to achieve the same result (before the language grew up and learned to use tuples).
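
As a rough illustration of what returning more than a single result can look like today, here is a minimal sketch assuming C++17 and std::optional. The name foo_optional, the failure condition, and the computation are all hypothetical stand-ins, since the original foo leaves them unspecified.

```cpp
#include <cassert>
#include <optional>

// Hypothetical variant of foo: failure is reported through an empty
// optional instead of a bool plus an out-parameter.
std::optional<int> foo_optional(int input) {
	if (input < 0) {          // assumed failure condition, illustration only
		return std::nullopt;
	}
	return input * 2;         // assumed placeholder computation
}

// Shows the two outcomes a caller has to handle.
void check_foo_optional() {
	assert(foo_optional(21).value() == 42);
	assert(!foo_optional(-1).has_value());
}
```

Nothing is passed by reference here, so the caller has no reference whose lifetime needs reasoning about.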


Most C++ developers will struggle to tell you when to use return-by-reference, or even whether it's legal (let alone safe) to write something like this:

#include <iostream>

class Singleton {
public:
	int x;

	int &retrieve_reference() {
		return x;
	}

	void print() {
		std::cout << "x is " << x << std::endl;
	}
};

Singleton singleton;

int main(int argc, const char *argv[]) {
	int &x = singleton.retrieve_reference();
	x = 42;

	singleton.print();
}

You'll know when you need a function declared with return-by-reference; developers never return by reference unless they have a strong justification for doing so. The only time you'd need it is when the caller must be able to change the state of an existing variable, which means that the variable must still be alive after the function returns. Disregarding the memory issues, what would even be the aim of code that effectively boils down to this:

int &foo() {
	int x = 7;
	return x;
}

But, starting with C++11, it's become all too easy to write code that does exactly that, albeit more subtly. The insidious capture-all-by-reference [&] makes it all too likely that a (lazy) developer will use it as a shortcut while fully aware of which variables are being referenced and when they are expected to go out of scope, only for a future commit (by the same developer, no less!) to introduce access to a different variable into the lambda, resulting in a use-after-free vulnerability.

Lambdas are a natural breeding ground for use-after-free errors. Variables captured by value in a lambda are read-only by default (unless mutable is used), yet lambdas are frequently used to reduce code repetition and eliminate copy-and-paste errors by modifying variables declared in an outer scope, which is only possible when those variables are captured by reference. The amount of literature, both online and offline, claiming that lambdas "extend" the lifetime of variables "captured" by value is exacerbating the situation (when, in reality, a copy of that variable is created, with a lifetime equal to that of the lambda itself).
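
The difference between the two capture modes can be sketched as follows. The function names are hypothetical; the point is only that a by-value capture is a separate copy living inside the closure, while a by-reference capture aliases the outer variable.

```cpp
#include <cassert>

// By-value capture: the lambda increments its own private copy.
int outer_after_value_capture() {
	int counter = 0;
	// mutable lets the lambda modify its copy; the outer counter is untouched.
	auto bump = [counter]() mutable { return ++counter; };
	bump();
	bump();
	return counter; // the outer variable is still 0
}

// By-reference capture: the lambda modifies the outer variable itself.
int outer_after_reference_capture() {
	int counter = 0;
	auto bump = [&counter]() { return ++counter; };
	bump();
	bump();
	return counter; // the outer variable is now 2
}

// Confirms both behaviours side by side.
void check_capture_modes() {
	assert(outer_after_value_capture() == 0);
	assert(outer_after_reference_capture() == 2);
}
```

The by-reference version is only safe here because the lambda never leaves the scope of counter.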

These gotchas don’t apply if you limit yourself to a subset of C/C++ that avoids pointers and references and only exchanges state between scopes using reference-counted smart pointers or passing-by-value. While references were a huge improvement over pointers, expecting references to eradicate memory problems instantly is either intellectual dishonesty or sheer naiveté. It’s crucial to remember that memory access violations happen in various forms and sizes and that replacing C pointers with C++ references only solves one type of problem (null dereference).

But I think we can all agree that if you’re forced to use reference-counted, heap-allocated variables for almost all of your code, almost all of the reasons to adopt C++17 vanish. It won’t be as sluggish as *insert name of your least favourite interpreted language here*, but it’ll be a lot more verbose!

— Bonus Content Addendum —

If you're still not convinced, here's an example showing that std::shared_ptr is still vulnerable to this problem, just not as visibly:

#include <cstdlib>
#include <set>
#include <functional>
#include <iostream>
#include <memory>

class int_wrapper;
std::set<int_wrapper*> destroyed;

class int_wrapper {
	int _value;

public:
	int_wrapper(int x) {
		_value = x;
	}

	~int_wrapper() {
		destroyed.insert(this);
	}

	int get_value() {
		if (::destroyed.find(this) != ::destroyed.end())
		{
			std::cerr << "mayday! get_value() called against"
				" destroyed object!" << std::endl;
		}
		return _value;
	}
};

std::function<void ()> bar() {
	std::shared_ptr<int_wrapper> ptr;

	// [&] captures the local ptr by reference, not by copy
	auto helper = [&](int x) {
		for (int i = 1; i <= x; ++i) {
			if (i == (rand() % x) || i == (rand() % 3)) {
				std::cout << "Picked a number!" << std::endl;
				ptr = std::make_shared<int_wrapper>(i);
				break;
			}
		}

		int to_print = ptr->get_value();
		std::cout << "The pointless convolution returned "
			<< to_print << std::endl;
	};

	bool condition1 = (rand() % 20) == 2;
	if (condition1) {
		return std::bind(helper, 12);
	}
	else {
		return std::bind(helper, 5);
	}
}

int main(int argc, const char *argv[]) {
	srand(42);

	auto foo = bar();
	foo();
}

The issue is that, while shared_ptr and co. are official "reference" counting methods of automatic memory allocation and deallocation, what is left unsaid is that what shared_ptr considers to be a reference is not the same "reference" that C++ (the language) uses (which is surprising until you consider it). A copy of a shared_ptr instance is a distinct/separate variable and counts as a reference, but a C++ reference to a shared_ptr instance is not counted at all.
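
A minimal sketch of that distinction, using use_count() to observe what shared_ptr actually counts. The function names are hypothetical; the lambdas are never called and exist only to hold the capture.

```cpp
#include <cassert>
#include <memory>

// Capturing by value copies the shared_ptr, so the count goes up.
long count_with_value_capture() {
	auto ptr = std::make_shared<int>(42);
	auto by_value = [ptr]() { return *ptr; };
	(void)by_value; // unused; it only holds the captured copy
	return ptr.use_count(); // two owners: ptr and the closure's copy
}

// Capturing by reference creates no new owner, so the count is unchanged.
long count_with_reference_capture() {
	auto ptr = std::make_shared<int>(42);
	auto by_reference = [&ptr]() { return *ptr; };
	(void)by_reference; // unused; it only holds a reference to ptr
	return ptr.use_count(); // one owner: ptr itself
}

// Confirms what each capture mode does to the reference count.
void check_shared_ptr_counts() {
	assert(count_with_value_capture() == 2);
	assert(count_with_reference_capture() == 1);
}
```

In the second case, if the lambda outlived ptr it would be holding a dangling reference to a destroyed shared_ptr, and the reference count would never have known about it.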


User Questions 

1. Why is C++ seen as risky?

C and C++ are inherently dangerous because performing an erroneous operation renders the entire program meaningless, rather than just the erroneous operation having an unknown outcome. Such erroneous operations are said to have undefined behaviour in these languages.

2. Why isn’t C a memory-safe language?

Java is said to be memory-safe because its runtime error detection checks array bounds and pointer dereferences. C and C++, on the other hand, allow unconstrained pointer arithmetic, with pointers implemented as direct memory addresses and no bounds checking, making them potentially memory-unsafe.

3. Is C++ a Secure Language?

C++ is dangerous for people who aren't used to carefully inspecting the implications of every single line of code. It can be used safely as long as you know what you're doing and do all the necessary testing before releasing the product. Without that care, C and C++ are highly dangerous languages.
