@danthe1st
Last active November 10, 2024 15:24
My expectations for programming languages

Introduction

I like using the Java programming language for many reasons. While there are some things about Java I would prefer to work in other ways, it does a lot of things right and I consider it "better" for me (meaning it matches my expectations more) than other programming languages. When I write code, I want to be able to do it as I do it in Java and these expectations can be seen as prerequisites for that.

The purpose of this gist is to show some reasons I prefer Java over other languages. It lists a few things that annoy me with other programming languages. This may contain controversial opinions and is by no means exhaustive.

Fulfilling the criteria listed here doesn't make something a good language alone but this gist lists some expectations I have from languages that are necessary for me to like it.

My expectations

OOP is natural

First of all, I think object-oriented programming is a natural way of modelling relations and writing code. I like the idea of keeping the data and the code that is allowed to interact with that data together in one place (a class).

All in all, I think the way object-oriented programming is handled in Java (particularly that methods are generally part of classes and that methods have to be explicitly made static) results in code that is easy to read. Java encourages writing OOP code by default which I think is a good thing.

Dynamic capabilities and runtime metaprogramming

Java provides many dynamic capabilities, giving developers tools that aren't available in many other languages. For example, the Reflection API allows introspecting and accessing/invoking members of classes, and there are more advanced mechanisms like runtime class loading, agents, classpath scanning or dynamic proxy classes. While these tools should be used with care, they can be really useful in some situations and make it possible to use paradigms like (runtime) Aspect-Oriented Programming.
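
As a rough illustration of two of these mechanisms, the following sketch looks up a method by name using reflection and wraps an object in a dynamic proxy that records each call. Greeter, FriendlyGreeter and the helper methods are hypothetical names made up for this example; only the java.lang.reflect classes are part of the JDK.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

class DynamicCapabilitiesDemo {
	// Reflection: look up a public method taking one String by name and invoke it
	static Object invokeByName(Object target, String methodName, String argument) throws ReflectiveOperationException {
		Method method = target.getClass().getMethod(methodName, String.class);
		return method.invoke(target, argument);
	}

	// Dynamic proxy: wrap a Greeter so that every call is recorded before it is forwarded
	static Greeter withCallLog(Greeter delegate, List<String> callLog) {
		InvocationHandler handler = (proxy, method, args) -> {
			callLog.add(method.getName());
			return method.invoke(delegate, args);
		};
		return (Greeter) Proxy.newProxyInstance(Greeter.class.getClassLoader(), new Class<?>[] { Greeter.class }, handler);
	}

	public static void main(String[] args) throws ReflectiveOperationException {
		Greeter greeter = new FriendlyGreeter();
		System.out.println(invokeByName(greeter, "greet", "Ada")); // prints "Hello, Ada!"

		List<String> callLog = new ArrayList<>();
		Greeter logged = withCallLog(greeter, callLog);
		System.out.println(logged.greet("Bob")); // prints "Hello, Bob!"
		System.out.println(callLog); // prints "[greet]"
	}
}

// Hypothetical example types, not part of any real library
interface Greeter {
	String greet(String name);
}

class FriendlyGreeter implements Greeter {
	@Override
	public String greet(String name) {
		return "Hello, " + name + "!";
	}
}
```

The proxy approach is roughly what runtime AOP frameworks build on: the caller only sees the Greeter interface while additional behavior is woven in around the call.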

Static typing with another dynamic type system

When I write a Java program, I know the (static) type of any variable I am dealing with and I can be sure that the methods and fields of these variables actually exist and contain what I expect.

The dynamic type system then gives me capabilities like runtime polymorphism, meaning I can add more information/functionality or specialize (override) implementations in subclasses.
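
A minimal sketch of how the two type systems interact, using hypothetical Notification classes invented for this example: the static type decides which members may be accessed, the dynamic type decides which implementation runs.

```java
class DispatchDemo {
	public static void main(String[] args) {
		// Static type: Notification. Dynamic (runtime) type: EmailNotification.
		Notification notification = new EmailNotification();
		System.out.println(notification.describe("build finished")); // prints "[email] build finished"
		// notification.recipient(); // would not compile: the static type Notification has no such method
	}
}

// Hypothetical example classes
class Notification {
	String channel() {
		return "log";
	}

	// Defined once in the superclass, but channel() is dispatched on the runtime type
	final String describe(String message) {
		return "[" + channel() + "] " + message;
	}
}

class EmailNotification extends Notification {
	@Override
	String channel() {
		return "email";
	}

	String recipient() {
		return "team@example.com";
	}
}
```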

nominal typing

Java uses nominal typing for most things. Types have a name and two types with different names are different even if their components are identical.

For example, with the following (record) classes, I don't want instances of Person/Product to be assignable to variables of the other class:

record Person(String name, int age) {}
record Product(String name, int priceInCents) {}

This wouldn't be the case when storing them as a structural type of {String name, int age} (possibly with a type alias).
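
The effect of nominal typing on the two records can be tried out with a small sketch like the following (the demo class name and the sample values are made up for illustration):

```java
class NominalTypingDemo {
	public static void main(String[] args) {
		Person person = new Person("Widget", 42);
		Product product = new Product("Widget", 42);

		// Product wrong = person; // does not compile: Person cannot be converted to Product

		// Even with identical component values, instances of differently named types are never equal
		System.out.println(person.equals(product)); // prints "false"
		System.out.println(person.equals(new Person("Widget", 42))); // prints "true"
	}
}

record Person(String name, int age) {}
record Product(String name, int priceInCents) {}
```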

While generics are a form of structural typing, I do think generics are necessary and make things easier and more readable overall.

checked exceptions

When throwing a checked exception, I can be sure it is handled in some way and I cannot forget about it. While it is possible to catch checked exceptions and suppress them, doing so is a conscious choice.

private void someCaller(){
	try {
		otherCaller();
	} catch(MyException e) {
		//if I don't re-throw the exception, I have to catch it and write logic handling it
		informUser(e);
	}
}

private void otherCaller() throws MyException {//I have to acknowledge the exception and opt-in to throw it to the caller
	methodThatMightFail();
}

private void methodThatMightFail() throws MyException {
	throw new MyException();
}
public class MyException extends Exception{}

Other than Java, there are unfortunately not many languages providing checked exceptions. There is a similar approach of using return types in some languages but this makes it a bit annoying to propagate that error state. Aside from that, if these methods don't return any result, calling them silently swallows the error state (and "must call"/"must use" requirements for returned values are kind of unstable/hard to properly enforce):

private void caller() {
	//I don't like this
	methodThatMightFail();//method can be called without acknowledging the error state if the result is not needed
}

private ResultState<Void> methodThatMightFail() {
	return new ResultState.Failure<>("this method failed");
}
public sealed interface ResultState<T> {
	public record Success<T>(T value) implements ResultState<T> {}
	public record Failure<T>(String message) implements ResultState<T> {}
}
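
To illustrate the propagation issue mentioned above, here is a sketch that passes a failure up one level using the same ResultState shape. The methods parsePort and buildAddress are hypothetical and exist only for this example; languages built around result types often offer shorthand for this step, which plain Java does not.

```java
class ResultPropagationDemo {
	static ResultState<Integer> parsePort(String text) {
		try {
			return new ResultState.Success<>(Integer.parseInt(text));
		} catch (NumberFormatException e) {
			return new ResultState.Failure<>("not a number: " + text);
		}
	}

	// Every intermediate caller has to unpack the result and re-wrap the failure by hand
	static ResultState<String> buildAddress(String host, String portText) {
		ResultState<Integer> port = parsePort(portText);
		if (port instanceof ResultState.Failure<Integer> failure) {
			return new ResultState.Failure<>(failure.message());
		}
		ResultState.Success<Integer> success = (ResultState.Success<Integer>) port;
		return new ResultState.Success<>(host + ":" + success.value());
	}

	public static void main(String[] args) {
		System.out.println(buildAddress("localhost", "8080")); // prints "Success[value=localhost:8080]"
		System.out.println(buildAddress("localhost", "http")); // prints "Failure[message=not a number: http]"
	}
}

sealed interface ResultState<T> {
	record Success<T>(T value) implements ResultState<T> {}
	record Failure<T>(String message) implements ResultState<T> {}
}
```

With a checked exception, buildAddress would only need a throws clause instead of the manual unpacking.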

C-like syntax (for control flow and method calls)

Most languages use syntax that is somewhat similar to C in terms of basic building blocks. By that, I mean using curly braces for blocks, conditions using if(condition) { /*statements here*/ }, loops like for(int i=0; i<1337; i++) {/*statements here*/}, the amount of whitespace not affecting semantics, method invocations using someMethod(arguments, here); or method declarations like this:

ReturnTypeHere methodName(Parameters here) {
	//body
	return result;
}

Having many languages use similar syntax for the basic building blocks allows people who know one language to understand code written in another. This doesn't mean every language should have the exact same syntax as C but that there are some things done in many languages and unless there is a significant benefit in one approach, I prefer using the one that is common.

no abbreviations in keywords

I don't like the idea of using abbreviations as keywords since it adds ambiguity into a language. For example, some languages use abbreviations of function (like fn, fun, func or def) for function declarations.

//I don't like that
pub func someMethod(){
	//some code
}

In my opinion, keywords shouldn't be abbreviated as using abbreviations for keywords leads to code that's hard to read (especially to people not that used to the language).

Also, I don't see any good reason to use a function (or whatever variant of that) keyword (in a statically typed language). At that point, just let me specify the return type instead of it so I always see the return type whenever looking at the function. Adding such a keyword is just unnecessary.

I can infer where the called code (declaration) is

When reading Java code (possibly outside of an IDE, e.g. when viewing code on GitHub), it is normally quite easy to find out where something I'm accessing/calling is declared. For example, method calls normally look like someMethod() for methods declared in the same class, a superclass or statically imported methods (where the import tells me the directory and file name to look for), SomeClass.someStaticMethod() (where SomeClass is imported, giving me the directory and file name to look for; in most cases (when SomeClass is a top-level class), it is in a file called SomeClass.java) or someObject.someMethod() where I can check the type of the object.

While wildcard imports are possible, most Java code uses static wildcard imports rarely. There's typically not more than one static wildcard import per file (in most cases there is none). And even for non-static wildcard imports (for which there are typically also not many in one file), I know the exact file name of the file to look for.

So all in all, when I see a method call or similar, I can easily find out the declaration and corresponding documentation, even when working outside of an IDE. This is in contrast to some other languages where functions are (often) top-level elements. For example, take the following code:

#include "a/someLibrary.h"
#include "b/otherLibrary.h"
#include "c/yetAnotherLibrary.h"
//many other includes

int main() {
	someFunction();//where is this declared?
}

functions shouldn't be variables unless explicitly stated

When I declare a function/method, this should be a function/method and not a variable that can possibly be reassigned. I don't want to see code like this be possible:

void someMethod() {
	//some code
}

void otherMethod() {
	//other code
}

void blackMagic() {
	//I don't like this
	var tmp = someMethod;
	someMethod = otherMethod;
	otherMethod = tmp;
}

Similarly, functions being different from variables allows being able to use the same name for both a variable and a function:

private int something = 1337;
public int something(){
	return something;
}
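
When a reassignable function value is actually wanted, Java requires opting in explicitly through a variable of a functional interface type. A small sketch (the class name is made up for this example; IntSupplier is part of java.util.function):

```java
import java.util.function.IntSupplier;

class FunctionVariableDemo {
	private int something = 1337;

	int something() {
		return something;
	}

	public static void main(String[] args) {
		FunctionVariableDemo demo = new FunctionVariableDemo();

		// Explicit opt-in: a variable that refers to the method
		IntSupplier supplier = demo::something;
		System.out.println(supplier.getAsInt()); // prints "1337"

		// The variable can be reassigned ...
		supplier = () -> 42;
		System.out.println(supplier.getAsInt()); // prints "42"

		// ... but the method itself is unaffected
		System.out.println(demo.something()); // prints "1337"
	}
}
```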

proper Open Source IDEs and tooling that works on any OS

In my opinion, there should be independent, Open Source IDEs that work platform-independently (on Linux, MacOS and Windows) with powerful tooling tailored to the language. With Java, this is clearly the case. While I personally prefer using Eclipse, which is completely Open Source, not controlled by a single company and platform-independent, IntelliJ is a good option as well (though technically not completely Open Source).

Both Eclipse and IntelliJ have excellent support for Java, build tools commonly used in Java projects (mainly Maven and Gradle), various Java frameworks and other JVM-related tooling. This doesn't mean that e.g. Visual Studio Code is bad in any way, but in contrast to Eclipse and IntelliJ, it wasn't built specifically for Java and therefore has less Java-aware tooling.

This includes advanced debugging features like tracepoints/trigger points which are aware of Java threads, powerful rewrite and code analysis tooling and things like specific tooling for Java EE/Jakarta EE, Spring or similar.

Aside from IDE integrations, there is also a lot of other developer tooling like OpenRewrite or Checkstyle.

Maven

I really like how Maven works in contrast to the build tooling of some other languages. First of all, it stores dependencies in a local repository that is shared between applications and each application can use whatever version it likes. I don't have the same dependencies multiple times if I use them in multiple projects. Each version of a dependency is stored only once.

Other than that, the pom.xml file is very powerful when it comes to configuration. Maven doesn't just manage dependencies, it controls the way the build works, packaging, deployment and whatever else should be part of the build while also avoiding the trap of tempting developers to write custom build code in the build configuration file.

no (unrestricted) operator overloading

Some languages support operator overloading. This is a controversial feature as it can be tempting to overuse it in a way that makes code unreadable. Even if I don't write code using operator overloading, I might still have to read code using it if the language I'm using supports that.

While I am not completely opposed to operator overloading, I dislike the idea of unrestricted overloading and prefer not having operator overloading as opposed to having it without any restrictions in a way that encourages writing unreadable code.

I don't know how to implement operator overloading well and which restrictions should be implemented in which way so I would prefer not having it until I see a good way to do it.

For example, if I see code c = a + b;, I would expect that to set c to some value without modifying a, b or the old value of c. I would expect the operation to be in some way similar to addition (for example having neutral/identity elements, and I would expect associativity as long as different types are not mixed) but I have no idea how that should be enforced. Similarly, I think that allowing arbitrary operators (like $, # or similar) to be overloaded is a bad idea as these don't have a clear, general meaning within the language.
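
For comparison, this is how these expectations look with a named method instead of an overloaded operator. The sketch uses java.math.BigInteger from the standard library, whose add method happens to behave this way; nothing in the language enforces such properties for arbitrary classes.

```java
import java.math.BigInteger;

class AdditionExpectationsDemo {
	public static void main(String[] args) {
		BigInteger a = BigInteger.valueOf(40);
		BigInteger b = BigInteger.valueOf(2);

		// add returns a new value, a and b stay unchanged
		BigInteger c = a.add(b);
		System.out.println(c); // prints "42"
		System.out.println(a + " " + b); // prints "40 2"

		// Identity element: adding zero yields an equal value
		System.out.println(a.add(BigInteger.ZERO).equals(a)); // prints "true"

		// Associativity: (a + b) + c equals a + (b + c)
		System.out.println(a.add(b).add(c).equals(a.add(b.add(c)))); // prints "true"
	}
}
```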

devs not trying to convince me the language would be better

With many modern or functional languages, there is a disproportionate number of devs annoying others by saying that their language is better than whatever language other people use. While liking a language is perfectly fine, trying to convince others who aren't interested in it is just annoying. Languages are generally not objectively better than other languages. In that regard, Java has the advantage that it's a fairly old and commonly used language, so people are less likely to try to convince others to use it as opposed to another language.

The JVM

The JVM is a platform that gives the applications running on it a lot of capabilities (like platform interoperability, the HotSpot JIT, etc.).

rich standard library

The Java standard library already provides functionality for most common tasks encountered by programmers. This includes the Collections framework, a date/time API, various concurrency utilities and many others.
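
A short sketch touching each of the three areas named above, using only classes from java.util, java.time and java.util.concurrent (the demo class and its helper methods are made up for illustration):

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.atomic.AtomicInteger;

class StandardLibraryDemo {
	// Collections: count occurrences in a map sorted by key
	static Map<String, Integer> countWords(List<String> words) {
		Map<String, Integer> counts = new TreeMap<>();
		for (String word : words) {
			counts.merge(word, 1, Integer::sum);
		}
		return counts;
	}

	// Date/time API: number of days between two calendar dates
	static long daysBetween(LocalDate start, LocalDate end) {
		return ChronoUnit.DAYS.between(start, end);
	}

	public static void main(String[] args) {
		System.out.println(countWords(List.of("b", "a", "b"))); // prints "{a=1, b=2}"
		System.out.println(daysBetween(LocalDate.of(2024, 1, 1), LocalDate.of(2024, 3, 1))); // prints "60"

		// Concurrency utilities: compare-and-swap on an atomic counter
		AtomicInteger counter = new AtomicInteger(5);
		System.out.println(counter.compareAndSet(5, 6)); // prints "true"
		System.out.println(counter.compareAndSet(5, 7)); // prints "false", the value is 6 by now
	}
}
```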

extensive ecosystem

Java is used a lot. This includes commercial applications (e.g. backends of companies), research, education, library development, all kinds of Android apps and many other things. This usage results in problems already having been solved many times before, there being multiple vendors for JDKs, support for various cryptographic algorithms and corresponding tooling, companies being able to get proper commercial support from multiple vendors etc.

Aside from the standard library and related tooling, there is also a vast number of external libraries for almost any general task I could use a library for.

advanced diagnosis/profiling tools

There are also many diagnostic tools available for the JVM, starting with stack traces/thread dumps, JDK Flight Recorder and the tools to analyze JFR recordings, async-profiler, JOL (Java Object Layout), JMH (Java Microbenchmark Harness) or tools for analyzing heap dumps which can also be used from IDEs. These are just a few examples as a lot of tooling has been written for Java applications over time.

proper debugging capabilities

Aside from basic debugging features available with most languages, I can also make use of more advanced debugging capabilities when working on Java code. This includes hot-swapping application code if signatures didn't change, and attaching a debugger to a (production) build that's running with all optimizations enabled while still being able to use breakpoints and access (reading from but also writing to) even unused variables that are otherwise eliminated. It also includes tracepoints or conditional breakpoints (with these conditions being Java expressions that can make use of anything the application can make use of) or similar. In the JVM, these capabilities are available due to a process called deoptimization.

GC

Java uses tracing garbage collection (as opposed to reference counting), which I consider the most convenient way of memory management. It's safe, developers don't have to think about memory management, and it ensures that unused objects are deallocated when necessary.

For example, if I have a GC, I can create a general graph data structure by modelling nodes with references to their neighbors (without a node or edge list!) and if I remove nodes (possibly in a multithreaded environment using AtomicReferences/compare-and-swap or similar) that are then no longer referenced, the GC takes care of freeing the memory. This is not possible without a (runtime) garbage collector. Whether this is a good way of modelling data depends on the exact use-case.
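
A minimal sketch of such a graph, assuming a hypothetical GraphNode class. For brevity it uses a concurrent set from java.util.concurrent rather than AtomicReferences with compare-and-swap, so connecting and disconnecting are not atomic as a whole.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class GraphDemo {
	public static void main(String[] args) {
		GraphNode a = new GraphNode("a");
		GraphNode b = new GraphNode("b");
		GraphNode c = new GraphNode("c");
		a.connect(b);
		b.connect(c);
		System.out.println(b.neighborCount()); // prints "2"

		// Detach b from the graph and drop the last local reference to it
		b.disconnectAll();
		b = null;
		// From here on, that node is unreachable and may be reclaimed by the GC at some point
		System.out.println(a.neighborCount() + " " + c.neighborCount()); // prints "0 0"
	}
}

// Hypothetical node type: the graph consists only of the nodes and their neighbor references
final class GraphNode {
	private final String label;
	private final Set<GraphNode> neighbors = ConcurrentHashMap.newKeySet();

	GraphNode(String label) {
		this.label = label;
	}

	String label() {
		return label;
	}

	int neighborCount() {
		return neighbors.size();
	}

	void connect(GraphNode other) {
		neighbors.add(other);
		other.neighbors.add(this);
	}

	void disconnectAll() {
		for (GraphNode neighbor : neighbors) {
			neighbor.neighbors.remove(this);
		}
		neighbors.clear();
	}
}
```

There is no central node list that would need to be updated, and no explicit deallocation: once nothing references a node anymore, the collector is free to reclaim it.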

A scorecard for languages

Here is a scorecard for programming languages inspired by the Joel test. This scorecard assigns a score between 0 and 17 points (both included) to a language. It consists of multiple yes/no questions, and each question answered with "yes" awards the number of points listed next to it to that language. The result tells you how interested I might be in hearing about the language. If the score is below 10, I'm probably not interested in hearing about the language. If the score is 15 or above, feel free to tell me about it (but telling me once is enough).

These questions can be read as "Can I write code like I do in Java?" or are about not having things that annoy me about other languages.

  • OOP
    • Does the language support classes with methods being part of the class (not declared next to classes)? ($\frac{1}{2}$ P)
    • Can subclasses override methods such that the superclass calling that method calls the overridden method? ($\frac{1}{2}$ P)
  • dynamic capabilities
    • Is it possible to introspect classes/methods/functions/etc by name? ($\frac{1}{2}$ P)
    • Is it possible to load new code at runtime in a way that this code can access any other code that's part of the application? ($\frac{1}{2}$ P)
  • type system
    • Does the language use a static type system (meaning that if a variable is declared as String, it will always be a String)? ($\frac{1}{2}$ P)
    • Does the language include a mechanism for dynamic dispatch (this doesn't need to be based on the receiver type/OOP is not a precondition of this)? ($\frac{1}{2}$ P)
  • Does the language use nominal typing by default? (1 P)
  • Does the language have checked exceptions (special return values don't count)? (1 P)
  • Does the language use C-like syntax for function/method declaration, invocation and basic control flow? (1 P)
  • Does the language not use abbreviations for the most important keywords (this point should be awarded only for languages that don't use abbreviations)? (1 P)
  • Can the location (target file) of the declaration of a class/method/function/variable (normally) be inferred just from the information present in the file where it is accessed? (1 P)
  • Can functions/methods not be reassigned (this point should be awarded only for languages that don't allow reassigning functions/methods)? (1 P)
  • Is there a proper Open Source IDE (VSC, Vim, Emacs etc. are considered text editors and don't count) supporting the language which is available on Linux, MacOS and Windows? (1 P)
  • Does the language not have unrestricted operator overloading (if the language has unrestricted operator overloading, this point shouldn't be awarded)? (1 P)
  • Are devs using the language not trying to convince devs using other languages to use the said language (if devs of that language are going to devs of other languages and telling them to use their preferred language, this point shouldn't be awarded)? (1 P)
  • standard library
    • Does the standard library contain the most commonly used collections (e.g. Lists, Maps, etc.)? ($\frac{1}{3}$ P)
    • Does the standard library contain concurrency utilities (e.g. locks, semaphores, mechanisms for Compare-And-Swap)? ($\frac{1}{3}$ P)
    • Does the standard library come with a date/time API? ($\frac{1}{3}$ P)
  • ecosystem/production-readiness
    • Are there commercial support offerings for the language (preferably from companies making significant contributions to the language)? ($\frac{1}{2}$ P)
    • Are there multiple vendors? ($\frac{1}{2}$ P)
  • diagnosis/profiling
    • Are there language-aware tools for inspecting memory used by applications written in that language? ($\frac{1}{2}$ P)
    • When a runtime error (e.g. an assumption the developer made is violated) occurs in a program written in the language, is a stack trace with names and line numbers logged? ($\frac{1}{2}$ P)
  • debugging/development
    • Is hot-swapping parts of the application possible when debugging? ($\frac{1}{2}$ P)
    • Is it possible to attach a debugger at runtime and then access unused variables/step through unused/eliminated code? ($\frac{1}{2}$ P)
  • Does the language use a GC (reference counting GCs don't count)? (1 P)
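
The arithmetic behind the scorecard can be sketched as follows. The class and its names are hypothetical; the weights are copied from the list above and stored in sixths of a point so that halves and thirds add up without rounding errors.

```java
import java.util.Collections;
import java.util.List;

class Scorecard {
	// Weight of each yes/no question in sixths of a point, in the order of the list above
	static final List<Integer> WEIGHTS_IN_SIXTHS = List.of(
		3, 3, // OOP
		3, 3, // dynamic capabilities
		3, 3, // type system
		6, 6, 6, 6, 6, 6, 6, 6, 6, // the nine 1-point questions from nominal typing to devs
		2, 2, 2, // standard library
		3, 3, // ecosystem/production-readiness
		3, 3, // diagnosis/profiling
		3, 3, // debugging/development
		6 // GC
	);

	// Sums the weights of all questions answered with "yes" (true)
	static double score(List<Boolean> answers) {
		if (answers.size() != WEIGHTS_IN_SIXTHS.size()) {
			throw new IllegalArgumentException("expected " + WEIGHTS_IN_SIXTHS.size() + " answers");
		}
		int sixths = 0;
		for (int i = 0; i < answers.size(); i++) {
			if (answers.get(i)) {
				sixths += WEIGHTS_IN_SIXTHS.get(i);
			}
		}
		return sixths / 6.0;
	}

	public static void main(String[] args) {
		int questions = WEIGHTS_IN_SIXTHS.size();
		System.out.println(score(Collections.nCopies(questions, true))); // prints "17.0"
		System.out.println(score(Collections.nCopies(questions, false))); // prints "0.0"
	}
}
```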