Hacker News

> Even if it’s underspecified, surely it should at least leave it _compiling_?

Are you using Claude Code? Do you have it configured so that it is not allowed to run the build? I ask because I've observed that Claude Code is extremely good at making sure the code compiles: it'll run a compile and address any compile errors as part of the work.

I just asked it to build a TOML example program in .NET using Tomlyn, and when it was done I was able to run "./bin/Debug/net8.0/dotnettoml example.toml". It had already built it for me (I watched it run the build step as part of its work, as I mentioned above).




I am using Claude Code. I didn’t explicitly tell it what the build command was (it’s dotnet build), and it didn’t ask. That’s not my fault.

> I’ve observed Claude Code is extremely good at making sure the code compiles

My observation is that it’s fine until it’s absolutely not, and the agentic loop fails.


> That's not my fault.

I don't know that it's useful to assign blame here.

It probably is to your benefit, if you are a coding professional, to understand why your results are so drastically different from what others are seeing. You started this thread saying "I keep getting told I'll be amazed at what it can do, but the tools keep failing at the first hurdle."

I'm telling you that something is wrong, and that is why you are getting poor results. I don't know what is wrong, but I've given you an example prompt and an example output showing that Claude Code is able to produce the exact output you were looking for. This is why a lot of people are saying "you'll be amazed at what it can do", and it points to you having some issue.

I don't know if you are running an ancient version of Claude Code, or if you are not using Opus 4.6, or if you are not using "high" effort (those are what I'm using to get the results I posted elsewhere in reply to your comment), but something is definitely wrong. Part of it may be that you don't have enough experience with the tooling, which I'd understand: if you are getting poor results, you have little (immediate) incentive to get more proficient.

As I said, I was able to tell Claude Code to do something like the example you gave, and it did it and it built, without me asking, and produced a working program on the first try.


> I don’t know that it’s useful to assign blame here

Oh, I’m blaming Claude, not anyone else. I’ve tried again this evening and the same prompt (in the same directory on the same project) worked.

> I don’t know if you’re using an ancient version of Claude Code,

I’m on a version from some time last week, and using Opus 4.6.

> This is why a lot of people are saying "you'll be amazed at what it can do", and it points to you having some issue.

If you look at my comments in these threads, I’ve had these issues and been posting about this for months. I’m still being told “you’re using the wrong model or the wrong tool or you’re holding it wrong”, and yet here I am.

I’m using plan mode and clearly breaking down tasks, and this happens to me basically every time I use the damn tool. Speaking to my team at work and friends in other workplaces, I hear the same thing. And yet we’re just using it wrong or doing something wrong.

Honestly, I genuinely think the people who are not having these experiences just… don’t notice that they are.


I understand that you think you are arguing that the models are bad, but the only thing people wonder is what you're doing to fail so spectacularly and whether you're actually being truthful.

And I’m wondering the same thing. The people replying to me are saying “your experience must be wrong/you must be doing something wrong” and then 1-2 threads later they say “well yeah it doesn’t do X but if I do Y and Z it works”, which… kind of proves the point?

I assure you I would have noticed if the result of my Friday effort was something that didn't compile, rather than a service that seems to work just fine. I've reviewed about half the code so far and it seems quite reasonable.

For some reason I'm getting downvoted for trying to help, but regardless, if you can (and want to) post a transcript somewhere of some of these sessions that aren't working out, maybe some of us who are having more success can take a look and see what's up.


https://news.ycombinator.com/item?id=47487638

Fresh prompt in a codebase with a claude.md: I asked it to do a simple task. I've shared the prompt, plan, and output in the linked gist.


OK, that was an extremely short interaction -- which may be the problem. I left a comment with suggestions.


