Inko is an interesting language, though the handling for panics sounds disturbing. Based on the documentation, panics abort the whole program and are not recoverable.
Things like dividing by zero or accessing an index outside of the bounds of an array will panic.
While these aren’t incredibly common, I would never want my whole production web server to go down because some random, infrequently used endpoint had a bug in some edge case.
The language seems heavily inspired by Erlang, which is very resilient to programmer errors in comparison.
In Erlang, the philosophy is to let a lightweight process crash and restart it when there is a problem. This doesn’t need to have any effect on other lightweight processes that are running.
YorickPeterse 35 days ago [-]
Panics are meant to be used sparingly and as a last resort, and are meant to signal "this code is utterly broken, go fix it". Handling such errors at runtime (e.g. by restarting just a lightweight process) doesn't make much sense, as you'll keep running into the issue over and over again and possibly even end up ignoring it.
For runtime error handling you'd use algebraic types such as Result and Option, similar to most functional languages and Rust.
The reason we don't allow catching of panics (such as Rust allows) is because it can result in people more or less ignoring such errors (something I've seen happen far too often in past projects) and just retrying over and over again, achieving the same result every time. I'd much rather have the program crash and scream loudly, forcing developers to address the issue. Fortunately, the chances of you actually running into a panic at runtime should be pretty slim, so not being able to catch them shouldn't be much of an issue in practice.
brokencode 35 days ago [-]
That's fair, and it's your language to design how you think is best. Just throwing out my two cents here based on my personal experiences.
I have 2000+ developers at my work, many of them making changes on a single huge web application. Compile/publish of the testing servers takes several hours and happens regularly throughout the day. There are thousands of testers all testing many different workflows all the time.
If one developer makes a mistake in a dark corner of the code one day, it seems bad to have the whole server crashing and disrupting all the other testing that’s going on for multiple hours while we wait for a fix and another build.
Quality and fix prioritization is in the hands of project teams for usability problems, performance problems, functional problems, etc. Uncaught exceptions should be logged somewhere, and it’s ultimately up to the team to decide whether they care enough to fix them.
someone654 35 days ago [-]
> have the whole server crashing
You do not need to solve that at the language level. A common pattern is using multiple replicas, service discovery and automatic restart upon a dead replica. For example kubernetes does this out of the box.
For dark corners of the code, that is often a good middle ground between interrupting the service and never dying.
brokencode 35 days ago [-]
Sure, it's a good idea to have something like that just in case. But crashing is really not desirable and could cause all kinds of problems, even if you do have recovery mechanisms in place.
A crash in a web server could result in data loss, timeouts and performance issues on clients due to the response never coming back, crashes on poorly written clients, cascading failures in poorly written services, etc.
And for a client application, imagine if your web browser would just randomly crash sometimes. Do you think it would make it much better if the browser would detect the crash and relaunch itself? And what if it relaunches itself and then crashes again for the same reason?
nicoburns 35 days ago [-]
I feel like if that's your take, then things like array indexing or dividing by zero should never panic. Because running into those things at runtime is very hard to prevent (unless you have tooling for enforcing that statically?)
YorickPeterse 35 days ago [-]
There are cases where you might want to choose: crash and burn, or handle at runtime. Where this makes sense, Inko does actually let you do that. For example, array indexing is done using one of two methods:
- Array.get/Array.get_mut for when you want to panic on an out-of-bounds index
- Array.opt/Array.opt_mut for when you want an Option type, with it being a None for an out of bounds index
This pattern is applied across the standard library where this makes sense, such that the developer can pick which option works best for their needs.
There are a few places where we check for some input at runtime and panic if this is invalid, but this is limited to cases where there's simply no sensible alternative (e.g. providing a key with an invalid size to a ChaCha20 cipher).
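For readers coming from Rust, its standard library draws a similar line, so as a rough analogue (this is Rust, not Inko syntax):

    fn main() {
        let xs = vec![10, 20, 30];

        // Panicking accessor: for when an out-of-bounds index means "this code is broken".
        let first = xs[0];

        // Option-returning accessor: for when out-of-bounds is an expected case to handle.
        match xs.get(99) {
            Some(v) => println!("found {v}"),
            None => println!("index 99 is out of bounds, handled without panicking"),
        }

        println!("first = {first}");
    }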
dullcrisp 35 days ago [-]
If panics can’t be handled at runtime they need to be impossible to introduce. Saying they just need to be fixed immediately might make sense for a project with a handful of developers but as others have said a major production system failing due to a panic wouldn’t be acceptable.
Too 35 days ago [-]
With those two names it's rather inevitable that any beginner of the language will use the obvious sounding, but panic-inducing get.
That's not scalable if it's not possible to catch panics. If panics can't be handled they must only be caused by super corner 0.001% cases, not by fundamental things like dividing an integer or indexing an array.
Muximize 35 days ago [-]
I think this is bad API design. The opt method should not exist, and the get method should return Option.
RossBencina 35 days ago [-]
On the topic of "sparingly and as a last resort" I agree, and I think there is more to it. There are situations where it makes sense to panic and there are situations where it doesn't make sense. There can be policies that guide when panic is/isn't appropriate.
In an open source library that I maintain, the C code makes use of asserts. These asserts are essentially unrecoverable panics on constraint/invariant violation. The effect of an assert failure is that violation renders the code unrunnable ("utterly broken"). Unfortunately "panic, go fix it" turns out to not be a great fit for a widely distributed library for at least three reasons: (1) the user of a library is often not in a position to fix it, (2) the distribution is so far removed from the maintainers that the maintainers may or may not get the message to fix it, and even if they can fix it the turnaround time to the next distro release doesn't help the current user very much, (3) if the constraint violations can be triggered by external code then we are essentially introducing "transitive panic" for conditions that we can't possibly fix (we don't control or have visibility into external code, such as Windows APIs and drivers).
The upshot of this is that our policies are to only panic (i.e. use an assert) to check conditions/constraints/invariants that are under the full control of our own code base (never to confirm assumptions about externally supplied data). If we are in a context where it is practicable, prefer raising error conditions such as "internal error" or "unexpected native API response".
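To make that policy concrete, here is a rough sketch of the same rule expressed in Rust terms (the function and error names are made up for illustration): assert only on invariants our own code controls, and report externally triggered conditions as error values.

    #[derive(Debug)]
    enum StreamError {
        UnexpectedNativeApiResponse(i32),
    }

    fn on_driver_callback(
        status_from_driver: i32, // externally supplied (e.g. an OS/driver API)
        cursor: usize,           // internal state, fully under our control
        ring_len: usize,
    ) -> Result<(), StreamError> {
        // Invariant of our own code: if this fails, the library itself is broken,
        // so a loud panic ("go fix it") is appropriate.
        debug_assert!(cursor < ring_len, "ring cursor out of range: bug in this library");

        // Externally supplied data: never assert on it, report it as a recoverable error.
        if status_from_driver < 0 {
            return Err(StreamError::UnexpectedNativeApiResponse(status_from_driver));
        }
        Ok(())
    }

    fn main() {
        // Simulated "unexpected native API response" from outside our code base.
        match on_driver_callback(-70, 3, 64) {
            Ok(()) => println!("callback ok"),
            Err(e) => eprintln!("recoverable error: {e:?}"),
        }
    }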
With the above in mind, I disagree with your point about supervision trees and restarting lightweight processes. Requiring panics to crash the whole process is the most extreme outcome possible, and can potentially lead to an extremely unsatisfactory user experience. Software is almost always performing many more functions than the little piece of code that panics. I find it difficult to believe that killing the whole process (i.e. unrecoverable panic) is universally the best course of action. Killing and permanently disabling the lightweight process would at least allow the rest of the program a chance to continue in some kind of limp mode until the issue can be fixed.
Looking at it another way, by killing the whole process, you are effectively limiting use to those cases where killing the whole process is indeed the only appropriate course of action.
On "you'll keep running into the issue over and over again and possibly even end up ignoring it" To me this smells over-opinionated and poorly justified. So long as there is good error reporting and diagnostics tracing, the user of the language can and should decide these things. But maybe it can be better justified and maybe there are plenty of use-cases that are fine with "no soup for you!" at the first sign of trouble.
patrick451 35 days ago [-]
I hope nobody ever uses inko for flight control software. Oh, your plane crashed because some hip language won't allow catching a panic? Don't worry, the developers are fixing it for the next release.
Sorry, but limping along is better than crashing.
dllthomas 23 days ago [-]
Limping along is sometimes better than crashing. It's highly context dependent.
BobaFloutist 35 days ago [-]
Ok so I'm a bit of a coding noob, but couldn't you hypothetically catch panics before they happen by having some boilerplate you insert in every case that could cause one?
I guess checking to make sure you aren't dividing by zero before you divide every time or checking an index access against the bounds of an array could have some performance implications, and also it would be prohibitively hard to actually enforce it across your team?
brokencode 35 days ago [-]
Yup, that would be the fix. It is a logical error to even try to divide by zero, so your code should handle it. Regardless of whether there is a panic, this is what you'd want to do.
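As a small illustration in Rust (just one way to express the guard; checked_div returns None instead of panicking on a zero divisor):

    fn safe_ratio(total: i64, count: i64) -> Option<i64> {
        // None when count == 0, instead of a panic.
        total.checked_div(count)
    }

    fn main() {
        match safe_ratio(10, 0) {
            Some(r) => println!("ratio = {r}"),
            None => println!("refusing to divide by zero; handled as an ordinary error"),
        }
    }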
But the problem is that people make mistakes or forget every once in a while, and as you get more people working on large programs, these mistakes happen more often.
So you have to assume that it'll most likely happen in your production software sooner or later. Crashing the whole program is a fairly catastrophic outcome in my opinion, and should be reserved for only truly unrecoverable problems like hardware failures.
rurban 34 days ago [-]
And how do you want to deal with fatal errors?
Overflows and divide by zero are easily avoidable and should be fatal.
Even the Apollo lander program aborted and even caused a reboot, to the point that the reboot loop, triggered just by overloading the IO, made the lander uncontrollable. The scheduler could have killed the radar process, but that was Aldrin's job, and he failed to do it, not the scheduler.
paulddraper 35 days ago [-]
> I would never want my whole production web server to go down because some random, infrequently used endpoint had a bug in some edge case.
Then you need an alternative.
And that alternative is...messy. Do you have exceptions? Do you type them? Or do you have explicit return values? Is there syntax sugar? Etc, etc.
brokencode 35 days ago [-]
All error handling is messy in one way or another. I am personally a fan of the exceptions in C#. It’s good enough for me and doesn’t make me add a bunch of error types to all my methods.
I also like the Erlang philosophy of letting the lightweight process crash as soon as any problem occurs and having a supervisor decide what to do.
nixpulvis 35 days ago [-]
That's why you have wrappers that return Result enumerations. There is an overhead to checking whether an operation is valid; pay for it if you need it.
This is exactly how Rust works, for example.
brokencode 35 days ago [-]
How does it know to panic if it isn't checking? It seems like I am paying the cost of the check either way, at least in terms of performance. So it should give me the ability to handle the error somehow.
As a C# and Typescript developer primarily, this concept of killing the whole program due to a bug anywhere in any part of the code is very foreign to me.
I’d expect that to only happen if you ran out of memory, had a hardware failure, or other such catastrophic error. To me, dividing by zero is a problem, but not worse than pretty much any other type of logical error that doesn’t panic.
paulddraper 35 days ago [-]
> killing the whole program due to a bug anywhere in any part of the code is very foreign to me.
But getting the wrong answer due to a bug anywhere in any part of the code is not?
brokencode 35 days ago [-]
Programs have all kinds of bugs for all kinds of reasons. That’s why you need unit testing, functional testing, performance testing, usability testing, accessibility testing, etc.
Why single out just a few types of bugs to crash the whole program? Let me catch it and decide what to do.
If it's in a single request to my web server, I'd like that single request to return an error code. I would not like all other concurrent requests for all other users to fail as well.
ludwik 35 days ago [-]
Allowing you to catch and handle an exception is not the same as silently returning a wrong answer.
caspper69 35 days ago [-]
It seems to me that exceptions have a bad rep.
Complaints range from too slow to hidden control flow to just not liking them.
I absolutely see the benefit in application code, otherwise you're error checking every function call.
On the other hand, I completely understand not using such a mechanism in kernel and driver code.
But Rust has decided against exceptions (which I actually believe were part of the original vision way back when, but please don't quote me), so now there's panics.
Everything is a tradeoff at some level.
paulddraper 35 days ago [-]
Yes, the wrong answer is terrifying
demurgos 35 days ago [-]
Rust's panics are recoverable, so they're more like exceptions. This means that you may need to care about "unwind" safety and values in a potentially invalid state. It's the main reason for mutex poisoning and having to deal with it when locking.
Overall, I'm not entirely sure if Rust would be better or not if panics were non-recoverable.
PS: For completeness, there are flags to control the panic behavior; but you can't rely on them when writing a lib.
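A small Rust example of the poisoning point mentioned above, for concreteness:

    use std::sync::{Arc, Mutex};
    use std::thread;

    fn main() {
        let data = Arc::new(Mutex::new(0_i32));

        // A thread that panics while holding the lock poisons the mutex.
        let d = Arc::clone(&data);
        let _ = thread::spawn(move || {
            let mut guard = d.lock().unwrap();
            *guard += 1;
            panic!("boom while holding the lock");
        })
        .join();

        // Later callers see the poisoning and must decide whether the data is still usable.
        match data.lock() {
            Ok(guard) => println!("value: {}", *guard),
            Err(poisoned) => println!("poisoned; value was {}", *poisoned.into_inner()),
        }
    }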
> It is not recommended to use this function for a general try/catch mechanism. The Result type is more appropriate to use for functions that can fail on a regular basis. Additionally, this function is not guaranteed to catch all panics, see the “Notes” section below.
I think panics in Rust are intended to be unrecoverable, and catch_unwind is mainly intended to make the shutdown more orderly and possibly localized. Any data structure that panicked is probably in an unknown state that is not safe for further use.
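For reference, this is what the localized, orderly-shutdown use of catch_unwind looks like (a minimal sketch; it only works with the default unwind panic strategy, not panic=abort):

    use std::panic;

    fn main() {
        // catch_unwind turns an unwinding panic into an Err. Per the docs quoted above,
        // it's meant for isolation/orderly shutdown, not as a general try/catch.
        let result = panic::catch_unwind(|| {
            let v: Vec<i32> = vec![1, 2, 3];
            v[99] // out-of-bounds index panics
        });

        match result {
            Ok(x) => println!("got {x}"),
            Err(_) => println!("the closure panicked; the rest of the program keeps running"),
        }
    }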
brokencode 35 days ago [-]
The situation in Rust seems to be a bit complicated. But I did find that panicking within a thread you spawn kills just that thread and not the whole program. And it seems that Tokio has the same behavior.
https://github.com/tokio-rs/tokio/issues/2002#issuecomment-6...
And web frameworks like Tower provide standard ways of handling panics and turning them into error responses.
https://docs.rs/tower-http/latest/tower_http/catch_panic/ind...
So I don’t think panics are necessarily meant to be program-killing in Rust, even if Result types are heavily recommended instead.
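A minimal std-only illustration of that behavior (again assuming the default unwind panic strategy):

    use std::thread;

    fn main() {
        // A panic inside a spawned thread unwinds only that thread;
        // join() reports it to the parent as an Err.
        let handle = thread::spawn(|| {
            let xs = vec![1, 2, 3];
            let i = xs.len() + 4; // out of bounds at runtime
            xs[i]
        });

        match handle.join() {
            Ok(v) => println!("worker returned {v}"),
            Err(_) => println!("worker panicked, but the main thread is still running"),
        }
    }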
Too 35 days ago [-]
https://doc.rust-lang.org/nomicon/unwinding.html explains the philosophy and how they got there a bit further. Rust grew from a short-lived task concept, where panic of a task wouldn't be the end of the world, then it got used to write bigger applications instead and finer catch-concepts were needed.
J_Shelby_J 34 days ago [-]
Catch unwind is useful when you have a dependency that for some reason panics and you can’t control it.
ryang2718 35 days ago [-]
Unless a library developer decides to abort on panic in their Cargo.toml; in that case I don't believe you can unwind.
pjc50 36 days ago [-]
This article seems to use "borrow" to mean what I would normally understand to be the reference count of a refcount-gc system? Rather than a Rust borrow, which is part of the type system and not counted at runtime.
In Inko a borrow is sort of a hybrid in that it does increment a counter, but there's no shared ownership like there is in traditional reference counting (e.g. a borrow is never tasked with cleaning up data).
Terminology is a bit annoying here, as e.g. "reference" can also be interpreted as "a pointer", though it depends on who you ask. I stuck with "borrow" because that just made the most sense to me :)
thomasmg 36 days ago [-]
> borrow is never tasked with cleaning up data
And the reason for this is that there is no branch needed at the end of the borrow, to check for refCount=0? (Or, is this at least one of the reasons?) I'm wondering about the performance impact for this... There's also the code size impact.
YorickPeterse 36 days ago [-]
Not quite. A borrow doesn't do cleanup because it's, well, a borrow, i.e. a temporary reference to some owned value. It's the owned value that's tasked with disposing of any resources/memory when it goes out of scope, just as in Rust.
The cost of borrowing (at least with heap types) is an increment upon creating the borrow, and a decrement upon disposing of the borrow. Over time I'd like to optimize those away, but that's not implemented at this time.
pjc50 36 days ago [-]
In C# you can force a type to be stack allocated with "ref struct". https://learn.microsoft.com/en-us/dotnet/csharp/language-ref...
harrison_clarke 36 days ago [-]
you can stack allocate with just `struct` in C#. if you put a struct in a local variable, it'll be on the stack
`ref struct` allows that struct to contain refs, and disallows putting it on the heap (so, you can't have an array/list of ref structs, and a class or a normal struct can't have a ref struct field)
neonsunset 36 days ago [-]
This is correct. Ref structs and refs are also subject to lifetime analysis similar[0] to Rust's, to prevent them from ever being invalidated and causing use-after-free (even if that "free" is essentially going out of scope or popping a stack frame).
I assume pjc50 specifically refers to the fact that `ref struct` gives a strong guarantee while a regular struct can be placed wherever.
[0]: https://em-tg.github.io/csborrow/
i think of it like having rust's rules, but you only get one (implicit) lifetime variable
(that's probably wrong in some subtle ways)
pjmlp 36 days ago [-]
Or stackalloc instead of new.
Rohansi 35 days ago [-]
Moreso for arrays
ivanjermakov 36 days ago [-]
The author introduces an inline type definition. Shouldn't the allocation strategy be decided by the caller, not the type definition? Or is there a way to heap allocate a value of an inline type by wrapping it into some utility type, similar to Rust's Box?
ninkendo 35 days ago [-]
> Shouldn't allocation strategy be decided by the caller, not type definiton?
Yes.
Swift made this mistake too. Classes are always heap allocated (and passed by reference and refcounted) and structs are always stack allocated (and passed by value).
It makes for a super awkward time trying to abstractly define a data model: you need to predict how people will be using your types, and how they’re used affects whether you should use struct or class.
The “swifty” way to do it is to just eschew class altogether and just always use struct, but with the enormous caveat that only classes can interop with ObjC. So you end up in an awkward state where if you need to send anything to an ObjC API, it has to be a class, and once you have a lot of classes, it starts to be more expensive to hold pointers to them inside structs, since it means you need to do a whole lot of incrementing/decrementing of refcounts. Because if you have a struct with 10 members that are all classes, you need to incur 10 refcount bumps to pass it to a function. Which means you may as well make that holding type a class, so that you only incur one refcount bump. But that just makes the problem worse, and you end up with code bases with all classes, even though that’s not the “swifty” way of doing things.
Rust did it right: there’s only one way to define a piece of data, and if you want it on the heap, you put it in a Box. If you want to share it, you put it in an Arc.
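For concreteness, a tiny Rust sketch of that (the type and field names are made up):

    use std::sync::Arc;

    struct Profile {
        name: String,
        scores: Vec<u32>,
    }

    fn main() {
        // Plain value: lives wherever its owner puts it (here, the current stack frame).
        let p = Profile { name: "ada".into(), scores: vec![1, 2, 3] };

        // The caller decides to put it on the heap...
        let boxed: Box<Profile> = Box::new(p);

        // ...or to share it, with reference counting, via Arc.
        let shared = Arc::new(Profile { name: "grace".into(), scores: vec![] });
        let shared2 = Arc::clone(&shared);

        println!("{} {} {}", boxed.name, shared.name, shared2.scores.len());
    }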
Too 35 days ago [-]
C# also has exactly the same concept. "class"-types are put on the heap and get assigned by reference, while "struct"-types are assigned by copy.
It always seemed weird to me. You need to know which data type you are working with when passing things around and cannot adjust according to the application. For certain types from the standard library it makes sense, like datetimes, which you probably always want to be copied. But when you work in a big team where everybody has their own style and special optimization corner case, it decays quickly.
YorickPeterse 36 days ago [-]
> Or there is a way to heap allocate value of inline type by wrapping it into some utility type, similar to Rust's Box?
There isn't. For this to work, functions for inline types have to be compiled such that one of these approaches is used:
1. They always take inline values by pointer, and we have to guarantee those pointers are never invalidated. This again means you need some sort of borrow checking scheme.
2. We compile two versions for each function: one that takes the data by value, and one by pointer, resulting in significant code bloat and compiler complexity.
I think the split also better captures the intent: heap types are for cyclic and heavily mutated values, inline types are more for short-lived mostly immutable values.
dcrazy 36 days ago [-]
> Shouldn't allocation strategy be decided by the caller, not type definiton?
This is how C++ is designed. Unfortunately, it precludes types from taking dependencies on their own address, which is critical for e.g. atomics. As far as I know, there is no way to actually force a C++ class to be heap allocated. I’ve tried.
Newer languages like Swift give the type designer the ability to say “the address of this object in memory is important and must remain stable for the lifetime of this object.” This decays to heap allocation.
badmintonbaseba 36 days ago [-]
> As far as I know, there is no way to actually force a C++ class to be heap allocated.
Make the destructor (and possibly the constructors) private, have public factory functions that hand out unique_ptr with a custom deleter, or define a public destroying operator delete (C++20) and use unique_ptr with its default deleter.
mtklein 36 days ago [-]
The idiom I am familiar with here is to make the constructor private and provide a public static factory method that allocates on the heap.
spacechild1 36 days ago [-]
> Unfortunately, it precludes types from taking dependencies on their own address,
It does not! You just have to make it non-copyable and non-moveable. std::mutex and std::atomic do exactly this.
> As far as I know, there is no way to actually force a C++ class to be heap allocated. I’ve tried.
As others have already pointed out: private constructor + factory method. However, this pattern is typically used for different reasons (implementation hiding or runtime polymorphism) and heap allocation is only a side effect.
> This decays to heap allocation.
No! The object might just as well live on the stack or have static storage duration.
slaymaker1907 35 days ago [-]
The only thing I can think of that really matters, and which can't be solved by just wrapping the data in an inner heap allocation (like std::vector does), is the pointer to the vtable for virtual function calls.
If anything, I've found it's more useful when I want to bypass such wrapping to force a class to use some particular memory (like from a bulk allocation). The STL is pretty good, but there are warts which still force the default heap allocator, like std::function.
Maxatar 35 days ago [-]
C++ does have ways to force heap allocation, but frankly it's just an antipattern.
It's easier to reason about value semantics than reference semantics. If you want a class with a stable address that is heap allocated, then do so by writing a class that has a private field that is heap allocated, preferably managed by a unique_ptr. Then disable copy assignment and construction, and make a judgement call on whether you want to support move assignment and construction.
In effect, the idea is to make this aspect of the type an implementation detail that the user doesn't need to concern themselves with. They just create an instance of the type, and the type deals with the ownership or ensuring that the address is pinned, or whatever other details are needed to make it work. And of course the type does this by delegating that work to unique_ptr or some other means of ownership.
jayd16 36 days ago [-]
Hmm, if it were decided by the caller, then methods on the inline type couldn't know whether members are copied or shared in assignment, I would think.
Seems trivial to box the inline types like most languages do.
etyp 36 days ago [-]
I like how this is structured. When I read that inline types get copied-on-borrow I was pretty put off. Then since fields of inline types can't be assigned new values it seems a bit better, as long as you roughly know what's happening. Hopefully the diagnostics are good enough there. I like the detailed alternatives that weren't chosen.
I appreciate being able to choose which side of the tradeoff (always-copy or heap allocated) you want to be on, but either way be assured it's safe. Not sure how I feel about it in practice without trying it, though :)
YorickPeterse 36 days ago [-]
On the diagnostics side of things, the compiler produces these in two places:
1. If you try to define a `mut` field (= one you can normally assign a new value), it will produce an error at the definition site
2. If you try to do something like `some_inline_value.field = value` it also produces an error at the call/assignment site
The actual messages could use some improvements, but that's not a super high priority for the time being as I'm more focusing on the language overall.
hinkley 36 days ago [-]
My entirely made up origin story of borrow checking is that escape analysis (for GC) made huge progress in the 00's and borrow checkers are just the logical conclusion of having that shoulder to stand on.
I don’t know what inspired borrow checking but I am certain someone else would have thought it up presently if they hadn’t.
panstromek 36 days ago [-]
I think the Rust borrow checker is inspired by Cyclone. Not sure how that coincides with the GC development timeline
mkehrt 36 days ago [-]
Exactly correct, I think. Cyclone allowed borrowing of "regions" (which were similar to rust lifetimes) in a very similar way. This was either based on or itself inspired other theoretical models of borrowing at around the same time; I'm not sure on the causality but I read all the literature at the time!
caspper69 35 days ago [-]
In my research a few weeks back, I went down the rabbit hole of region based memory management, and Cyclone was one of the first languages I came across (the majority of academic papers on the topic retrofitted an existing language- usually C).
I might be wrong here, so please feel free to correct me if so, but I don't think borrowing was a concept, per se, of the language itself.
As you mention, the concept the Rust designers took from Cyclone was explicit lifetimes.
Borrow checking provides two features (but in my opinion in a very un-ergonomic way): (1) prevention of use after free; and (2) temporal memory safety (i.e. guaranteeing no data races, but not eliminating race conditions in general).
I'm still wobbly on PLT legs though; I'm sure there's a pro or ten who could step in and elaborate.
cwzwarich 35 days ago [-]
Cyclone had borrowing. See Section 4.4 (Temporary Aliasing) in the paper
https://homes.cs.washington.edu/~djg/papers/cyclone_memory.p...
or the more detailed discussion throughout this journal paper:
https://homes.cs.washington.edu/~djg/papers/cyclone_scp.pdf
As their citations indicate, the idea of borrowing appeared immediately in the application of substructural logics to programming languages, back to Wadler's "Linear types can change the world!". It's just too painful without it.
caspper69 35 days ago [-]
Thank you.
I appreciate the follow up and references.
Now I've got some Friday night reading :)
panstromek 33 days ago [-]
Curiously, Rust compiler still uses the "region" terminology internally for lifetimes.
pcwalton 35 days ago [-]
The borrow checker is more influenced by substructural type systems, such as the one in the MLKit [1] and Cyclone.
[1]: http://mlton.org/Regions
A nicely written article! And an interesting project.
Myself, I'd lean towards a sound (linear) type theory. If it's not too much trouble, insert the run-time checks in debug builds but use the type system to erase them for optimized builds. It might seem like the mountain is impossible to climb if you're not used to formalizing such systems but every mountain is surmounted one step at a time.
It's hard to bolt-on correctness after the fact. In my experience, for critical pieces like this, it's better to get the specification right first before digging in too deep and writing code.
Best of luck on the project either way you go. Memory safety is increasingly important!
YorickPeterse 36 days ago [-]
Although linear typing is certainly interesting, I think linear typing on its own is not enough. That is, you basically end up with something like Austral where you use linear types _plus_ some form of compile-time borrow checking, at which point we're back at borrow checking (did I mention it seems inevitable?) :)
agentultra 36 days ago [-]
You did! And maybe it is inevitable? I don't know.
It'd be interesting to see different theories evolve, for sure. Maybe something in separation logic will make its way into mainstream type theory and into compilers at some point.
What's cool is how much we're starting to see folks push the boundaries these days. :)
Kinrany 36 days ago [-]
Wouldn't the borrow checking be simpler when built on top of linear types?
YorickPeterse 36 days ago [-]
I don't think the choice of linear vs affine makes much of a difference, but I could be mistaken.
sunshowers 36 days ago [-]
I think a real challenge with trying to work on specifications first is error handling -- you often find that a truly sound model is quite difficult to explain to users. So some prototyping and iteration becomes necessary in my experience.
Like, rustc only recently gained the ability to explain that the borrow checker rejected a program because a lifetime parameter was invariant. And this isn't even an artificial bit of complexity -- if you have mutability you are required to think about variance. If you have a Cell<&'static str> you cannot just turn that into a Cell<&'a str>, the way you can turn a regular &'static str into a &'a str. (Java programmers might be familiar with similar issues around ArrayList<Object>.)
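A small Rust illustration of that point (function names made up); the commented-out function is the one the borrow checker rejects:

    use std::cell::Cell;

    // Shared references are covariant in their lifetime, so &'static str
    // can be used where a shorter-lived &'a str is expected.
    fn shorten<'a>(s: &'static str) -> &'a str {
        s
    }

    // Cell<T> is invariant in T, so Cell<&'static str> is NOT usable as Cell<&'a str>.
    // The analogous function does not compile:
    //
    // fn shorten_cell<'a>(c: Cell<&'static str>) -> Cell<&'a str> {
    //     c // error: lifetime mismatch
    // }

    fn main() {
        let s: &str = shorten("hello");
        let cell = Cell::new("world"); // Cell<&'static str>
        println!("{s} {}", cell.get());
    }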
harrison_clarke 36 days ago [-]
something i'd like to see in a borrow checker (which i think causes problems in rust, because of the "leaking is safe" thing. which sounds like it could be a difficult hole to plug in language design):
in rust, &mut means "this pointer points to an initialized value, and will point to an initialized value at the end of scope"
i wish it also had &in, &out, and &tmp, for "initialized->uninitialized", "uninitialized->initialized", and "uninitialized->uninitialized"
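For context, the closest thing in today's Rust is probably MaybeUninit, where the initialized/uninitialized bookkeeping is manual rather than checked by the compiler; a rough sketch of an `&out`-style parameter:

    use std::mem::MaybeUninit;

    // A poor man's `&out`: the callee is supposed to initialize the slot,
    // but nothing in the type system enforces it; assume_init is unsafe.
    fn fill(out: &mut MaybeUninit<[u8; 64]>) {
        out.write([0u8; 64]);
    }

    fn main() {
        let mut buf = MaybeUninit::<[u8; 64]>::uninit();
        fill(&mut buf);
        // Safety: `fill` wrote a complete value above.
        let buf = unsafe { buf.assume_init() };
        assert_eq!(buf[0], 0);
    }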
masklinn 36 days ago [-]
Their intra-function behaviour seems difficult to define for non trivial cases. For instance a &tmp would need to be write only until it’s written to, then it can be read from as well, but it needs to be consumed before its scope ends, transitioning back to a write only reference. So you’d need a type system which can transition parameters through typestates (is that a subset of effects systems or unrelated?).
wavemode 36 days ago [-]
What you're describing is exactly correct - you need a more robust "typestate" system (I call them "annotations"). Most languages have typestates - where you can, for example, declare a variable without immediately assigning it a value, but that variable remains write-only until it is assigned to.
But these typestate systems aren't very elaborate usually. I've recently been doing research on the (hypothetical) design of a language which has typestates that can cross function boundaries - you can call a function that annotates that it uninitializes one of its arguments, and then that reference you passed in is now considered uninitialized in your local scope.
quotemstr 34 days ago [-]
All this just to avoid having a GC? One day we're going to have to stop fetishizing manual memory management.
Thanks for pointing me to that. Upon reflection, there's a further conversation to be had in PL design:
> The rationale for this is that at some point, all garbage collected languages run into the same issue: the workload is too great for the garbage collector to keep up.
What this actually means is "The program I wrote is a poor fit for the garbage collector I am using." which can be fixed by either changing the program or changing the garbage collector. People often focus on the latter and forget the former is a possibility[1].
Similarly with single-ownership and borrows, you can write a program that is fighting the memory management system (see e.g. any article on writing a doubly-linked list in Rust).
In other structured memory allocation systems (hierarchical, pools &c.), the memory allocation lends itself well to certain architectures of code.
As far as I know, nobody has done a comparison of various memory management systems and how they enable and hinder various forms of program design.
1: The "time" special operator in SBCL shows allocation and GC time statistics by default. I think this nudges people into thinking about how their program is allocating at exactly the right time: when they are worried about how long something is taking.
pkulak 35 days ago [-]
> Inko doesn't rely on garbage collection to manage memory.
Reference counting is garbage collection. I've been on this hill for years, and I am prepared to die here.
ej1 36 days ago [-]
[flagged]
isoprophlex 36 days ago [-]
Your bot can go fuck itself, the internet is zombiefied by undead AIs enough as it is.