I wonder what the thought process of the Go designers was when coming up with that approach. Function scope is rarely what a user needs, has major pitfalls, and is more complex to implement in the compiler (need to append to an unbounded list).
> I wonder what the thought process of the Go designers was when coming up with that approach.
Sometimes we need block-scoped cleanup, other times we need function-scoped cleanup.
You can turn a function-scoped defer into a block-scoped defer with a function literal.
AFAICT, you cannot turn a block-scoped defer into a function-scoped one.
So I think the choice was obvious - go with the more general(izable) variant. Picking the alternative, which can do only half of the job, would be IMO a mistake.
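The function-literal conversion mentioned above might look like this minimal sketch (the `process` helper and its loop body are hypothetical, not from the thread):

```go
package main

import (
	"fmt"
	"os"
)

func process(paths []string) error {
	for _, path := range paths {
		// Wrapping the iteration body in a function literal gives
		// defer block-like scope: the file is closed at the end of
		// each iteration, not at the end of process().
		err := func() error {
			f, err := os.Open(path)
			if err != nil {
				return err
			}
			defer f.Close()
			// ... work with f ...
			return nil
		}()
		if err != nil {
			return err
		}
	}
	return nil
}

func main() {
	fmt.Println(process(nil))
}
```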
I hate even more that you can call defer in a loop, and it will appear to work, as long as the loop has relatively few iterations, and is just silently massively wasteful.
I know. Or in some cases, you can put the loop body in a dedicated function. There are workarounds. It's just bad that the wrong way a) is the most obvious way, and b) is silently wrong in such a way that it appears to work during testing, often becoming a problem only when confronted with real-world data, and often surfacing only as being a hard-to-debug performance or resource usage issue.
In a tight loop you'd want your cleanup to happen after the fact. And in, say, an IO loop, you're going to want concurrency anyway, which necessarily introduces new function scope.
> In a tight loop you'd want your cleanup to happen after the fact.
Why? Doing 10 000 iterations where each iteration allocates and operates on a resource, then later going through and freeing those 10 000 resources, is not better than doing 10 000 iterations where each iteration allocates a resource, operates on it, and frees it. You just waste more resources.
> And in, say, an IO loop, you're going to want concurrency anyway
This is not necessarily true; not everything is so performance sensitive that you want to add the significant complexity of doing it async. Often, a simple loop where each iteration opens a file, reads stuff from it, and closes it, is more than good enough.
Say you have a folder with a bunch of data files you need to work on. Maybe the work you do per file is significant and easily parallelizable; you would probably want to iterate through the files one by one and process each file with all your cores. There are even situations where the output of working on one file becomes part of the input for work on the next file.
Anyway, I will concede that all of this is sort of an edge case which doesn't come up that often. But why should the obvious way be the wrong way? Block-scoped defer is the most obvious solution since variable lifetimes are naturally block-scoped; what's the argument for why it ought to be different?
It doesn't just have to be files, FWIW. I once worked in a Go project which used SDL through CGO for drawing. "Widgets" were basically functions which would allocate an SDL surface, draw to it using Cairo, and return it to Go code. That SDL surface would be wrapped in a Go wrapper with a Destroy method which would call SDL_DestroySurface.
And to draw a surface to the screen, you need to create an SDL texture from it. If that's all you want to do, you can then destroy the SDL surface.
So you could imagine code like this:
strings := []string{"Lorem", "ipsum", "dolor", "sit", "amet"}
stringTextures := []SDLTexture{}
for _, s := range strings {
    surface := RenderTextToSurface(s)
    defer surface.Destroy()
    stringTextures = append(stringTextures, surface.CreateTexture())
}
Oops, you're now using way more memory than you need!
Why would you allocate and free memory on each iteration when you can reuse it to much greater effect? Perhaps because of bad API design, but a language isn't there to paper over bad design decisions. A good language makes bad design decisions painful.
The surfaces are all of different size, so the code would have to be more complex, resizing some underlying buffer on demand. You'd have to split up the text rendering into an API to measure the text and an API to render the text, so that you could resize the buffer. So you'd introduce quite a lot of extra complexity.
And what would be the benefit? You save up to one malloc and free per string you want to render, but text rendering is so demanding it completely drowns out the cost of one allocation.
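For concreteness, the measure-then-render split being described might be sketched like this (all type and function names here are hypothetical, and the measurement logic is a stand-in for real text metrics):

```go
package main

import "fmt"

// Extent is a hypothetical text-measurement result.
type Extent struct{ W, H int }

// TextRenderer reuses one backing buffer across calls instead of
// allocating a fresh surface per string.
type TextRenderer struct {
	buf []byte // shared pixel buffer, grown on demand
}

// Measure returns the pixel size the rendered text would occupy.
// (Stand-in for a real text-measurement call.)
func (r *TextRenderer) Measure(s string) Extent {
	return Extent{W: 8 * len(s), H: 16}
}

// Render grows the shared buffer if needed, then draws into it.
func (r *TextRenderer) Render(s string) []byte {
	ext := r.Measure(s)
	need := ext.W * ext.H * 4 // RGBA
	if cap(r.buf) < need {
		r.buf = make([]byte, need)
	}
	r.buf = r.buf[:need]
	// ... actual drawing would go here ...
	return r.buf
}

func main() {
	r := &TextRenderer{}
	a := r.Render("Lorem")
	b := r.Render("ipsum dolor") // reuses/grows the same buffer
	fmt.Println(len(a), len(b))
}
```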
Why does the buffer need to be resized? Your malloc version allocates a fixed amount of memory on each iteration. You can allocate the same amount of memory ahead of time.
If you were dynamically changing the malloc allocation size on each iteration then you have a case for a growable buffer to do the same, but in that case you would already have all the complexity of which you speak as required to support a dynamically-sized malloc.
> The example allocates an SDL_Surface large enough to fit the text string each iteration.
Impossible without knowing how much to allocate, which you indicate would require adding a bunch of complexity. However, I am willing to chalk that up to being a typo. Given that we are now calculating how much to allocate on each iteration, where is the meaningful complexity? I see almost no difference between:
>> The example allocates an SDL_Surface large enough to fit the text string each iteration.
> Impossible without knowing how much to allocate
But we do know how much to allocate? The implementation of this example's RenderTextToSurface function would use SDL functions to measure the text, then allocate an SDL_Surface large enough, then draw to that surface.
> I see almost no difference between: (code example) and (code example)
What? Those two code examples aren't even in the same language as the code I showed.
The difference would be between the example I gave earlier:
stringTextures := []SDLTexture{}
for _, str := range strings {
    surface := RenderTextToSurface(str)
    defer surface.Destroy()
    stringTextures = append(stringTextures, surface.CreateTexture())
}
> Remember, I'm talking about the API to a Go wrapper around SDL.
We were talking about using malloc/free vs. a resizable buffer. Happy to progress the discussion towards a Go API, however. That, obviously, is going to look something more like this:
renderer := SDLRenderer()
defer renderer.Destroy()
for _, str := range strings {
    surface := renderer.RenderTextToSurface(str)
    textures = append(textures, renderer.CreateTextureFromSurface(surface))
}
I have no idea why you think it would look like that monstrosity you came up with.
> No. We were talking about using malloc/free vs. a resizable buffer.
No. This is a conversation about Go. My example[1], which you responded to, was an example taken from a real-world project I've worked on which uses Go wrappers around SDL functions to render text. Nowhere did I mention malloc or free; you brought those up.
The code you gave this time is literally my first example (again, [1]), which allocates a new surface every time, except that you forgot to destroy the surface. Good job.
I invite you to read the code again. You missed a few things. Notably it uses a shared memory buffer, as discussed, and does free it upon defer being executed. It is essentially equivalent to the second C snippet above, while your original example is essentially equivalent to the first C snippet.
Wait, so your wrapper around SDL_Renderer now also inexplicably contains a scratch buffer? I guess that explains why you put RenderTextToSurface on your SDL_Renderer wrapper, but ... that's some really weird API design. Why does the SDL_Renderer wrapper know how to use SDL_TTF or PangoCairo to draw text to a surface? Why does SDL_Renderer then own the resulting surface?
To anyone used to SDL, your proposed API is extremely surprising.
It would've made your point clearer if you'd explained this coupling between SDL_Renderer and text rendering in your original post.
But yes, I concede that if there was any reason to do so, putting a scratch surface into your SDL_Renderer that you can auto-resize and render text to would be a solution that makes for slightly nicer API design. Your SDL_Renderer now needs to be passed around as a parameter to stuff which only ought to need to concern itself with CPU rendering, and you now need to deal with mutexes if you have multiple goroutines rendering text, but those would've been alright trade-offs -- again, if there was a reason to do so. But there's not; the allocation is fast and the text rendering is slow.
You're right to call out that the SDLRenderer name was a poor choice. SDL is an implementation detail that should be completely hidden from the user of the API. That it may or may not use SDL under the hood is irrelevant to the user of the API. If the user wanted to use SDL, they would do so directly. The whole point of this kind of abstraction, of course, is to decouple from the dependence on something like SDL. Point taken.
Aside from my failure in dealing with the hardest problem in computer science, how would you improve the intent of the API? It is clearly improved over the original version, but we would do well to iterate towards something even better.
Some hypothetical example numbers: if software-rendering text takes 0.1 milliseconds, and I have a handful of text strings to render, I may not care that rendering the strings takes a millisecond or two.
But that 0.1 millisecond to render a string is an eternity compared to the time it takes to allocate some memory, which might be on the order of single digit microseconds. Saving a microsecond from a process which takes 0.1 milliseconds isn't noticeable.
You might not care today, but the next guy tasked to render many millions of strings tomorrow does care. If he has to build yet another API that ultimately does the same thing and is almost exactly the same, something has gone wrong. A good API is accommodating to users of all kinds.
It might be preferable to create a font atlas and just allocate printable ASCII characters as a spritesheet (a single SDL_Texture* reference and an array of rects). Rather than allocating a texture for each string, you just iterate the string and blit the characters; no new allocations necessary.
If you need something more complex, with kerning and the like, the current version of SDL_TTF can create font atlases for various backends.
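A rough sketch of the atlas approach, using hypothetical wrapper types rather than real SDL bindings (Rect and Texture stand in for SDL_Rect and SDL_Texture, and blit stands in for a real copy call):

```go
package main

import "fmt"

// Rect and Texture are hypothetical stand-ins for SDL_Rect and SDL_Texture.
type Rect struct{ X, Y, W, H int }
type Texture struct{} // a GPU texture handle would live here

// Atlas holds one texture containing all printable ASCII glyphs,
// plus the source rectangle of each glyph within it.
type Atlas struct {
	tex    *Texture
	glyphs [128]Rect
}

// DrawString blits one glyph per character from the shared atlas
// texture; no per-string allocations are needed.
func (a *Atlas) DrawString(s string, x, y int, blit func(src Rect, dstX, dstY int)) {
	for _, c := range s {
		if c < 32 || c > 126 {
			continue // skip characters outside printable ASCII
		}
		src := a.glyphs[c]
		blit(src, x, y)
		x += src.W
	}
}

func main() {
	a := &Atlas{}
	for i := range a.glyphs {
		a.glyphs[i] = Rect{W: 8, H: 16} // monospace glyphs for the sketch
	}
	n := 0
	a.DrawString("Lorem ipsum", 0, 0, func(Rect, int, int) { n++ })
	fmt.Println(n) // one blit per printable character
}
```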
Completely depends on context. If you're rendering dynamically changing text, you should do as you say. If you have some completely static text, there's really nothing wrong with doing the text rendering once using PangoCairo and then re-using that texture. Doing it with PangoCairo also lets you do other fancy things like drop shadows easier.
Opening a file is fairly fast (at least if you're on Linux; Windows not so much). Synchronous code is simpler than concurrent code. If processing files sequentially is fast enough, for what reason would you want to open them concurrently?
For concurrent processing you'd probably do something like splitting the file names into several batches and process those batches sequentially in each goroutine, so it's very much possible that you'd have an exact same loop for the concurrent scenario.
P.S. If you have enough files, you don't want to try to open them all at once — Go will start creating more and more threads to handle the "blocked" syscalls (open(2) in this case), and you can hit the 10,000-thread limit too.
You'd probably have to be doing something pretty unusual to not use a worker queue. Your "P.S." point being a perfect case in point as to why.
If you have a legitimate reason for doing something unusual, it is fine to have to use the tools unusually. It serves as a useful reminder that you are purposefully doing something unusual rather than simply making a bad design choice. A good language makes bad design decisions painful.
You have now transformed the easy problem of "iterate through some files" into the much more complex problem of either finding a work queue library or writing your own work queue library; and you're baking in the assumption that the only reasonable way to use that work queue is to make each work item exactly one file.
What you propose is not a bad solution, but don't come here and pretend it's the only reasonable solution for almost all situations. It's not. Sometimes, you want each work item to be a list of files, if processing one file is fast enough for synchronisation overhead to be significant. Often, you don't have to care so much about the wall clock time your loop takes and it's fast enough to just do sequentially. Sometimes, you're implementing a non-important background task where you intentionally want to only bother one core. None of these are super unusual situations.
It is telling that you keep insisting that any solution that's not a one-file-per-work-item work queue is super strange and should be punished by the language's design, when you haven't even responded to my core argument that: sometimes sequential is fast enough.
Your comment was in reply to nasretdinov, but its fundamental logic ignores what I've been telling you this whole time. You're pretending that the only solution to iterating through files is a work queue and that any solution that does a synchronous open/close for each iteration is fundamentally bad. I have told you why it isn't: you don't always need the performance.
for _, filename := range files {
    queue <- func() {
        f, _ := os.Open(filename)
        defer f.Close()
    }
}
or more realistically,
var group errgroup.Group
group.SetLimit(10)
for _, filename := range files {
    group.Go(func() error {
        f, err := os.Open(filename)
        if err != nil {
            return fmt.Errorf("failed to open file %s: %w", filename, err)
        }
        defer f.Close()
        // ...
        return nil
    })
}
if err := group.Wait(); err != nil {
    return fmt.Errorf("failed to process files: %w", err)
}
Perhaps you can elaborate?
I did read your code, but it is not clear where the worker queue is. It looks like it ranges over (presumably) a channel of filenames, which is not meaningfully different than ranging over a slice of filenames. That is the original, non-concurrent solution, more or less.
// Spawn workers
for range 10 {
    go func() {
        for path := range workQueue {
            fp, err := os.Open(path)
            if err != nil { ... }
            defer fp.Close()
            // do work
        }
    }()
}

// Iterate files and give work to workers
for _, path := range paths {
    workQueue <- path
}
Maybe, but why would one introduce coupling between the worker queue and the work being done? That is a poor design.
Now we know why it was painful. What is interesting here is that the pain wasn't noticed as a signal that the design was off. I wonder why?
We should dive into that topic. I suspect at the heart of it lies why there is so much general dislike for Go as a language, it being far less forgiving of poor choices than a lot of other popular languages.
I think your issue is that you're an architecture astronaut. This is not a compliment. It's okay for things to just do the thing they're meant to do and not be super duper generic and extensible.
It is perfectly okay inside of a package. Once you introduce exports, as seen in another thread, then there is good reason to think more carefully about how users are going to use it. Pulling the rug out from underneath them later when you discover your original API was ill-conceived is not good citizenry.
But one does still have to be mindful if they want to write software productively. Using a "super duper generic and extensible" solution means that things like error propagation are already solved for you. Your code, on the other hand, is going to quickly become a mess once you start adding all that extra machinery. It didn't go unnoticed that you conveniently left that out.
Maybe that no longer matters with LLMs, when you don't even have to look at the code and producing it is effectively free, but LLMs these days also understand how defer works, so this whole thing becomes moot.