I know most of the comments here are about the use of GA, but I just want to comment on the main point to say I think it's pretty great they are going with an open-source-first attitude. I work for a big intergovernmental org, and I can say that most, if not all, of our projects are very much closed source. :/
And importantly, when to use a deadbolt, or a deadbolt plus a steel-frame door, or a safe. All of which are well understood and can be implemented as circumstances warrant, i.e. tool shed vs. bank vault.
I try to answer this every time the question is asked; my last attempt seemed to be satisfactory to most people so easiest to just link to it: https://news.ycombinator.com/item?id=15070904
And I try to bring it up every time gov.uk is mentioned.
Safe harbour provisions and contractual agreements of this sort are effectively worthless when it comes to crossing borders, particularly where the US is involved.
Using adblockers to turn it off is not acceptable, and won't protect the majority of less tech-savvy folks.
If you can use Piwik for things you think must be more secure then that tells me two things -
1. It's possible for you to use Piwik (a rough sketch of what that looks like is below)
2. You don't believe that sending sensitive data to Google is always a good idea either. You just don't think that most government-citizen interactions are sufficiently sensitive, for some reason.
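For what it's worth, the stock Piwik client-side snippet is close to a drop-in replacement for the GA one. A minimal sketch (the analytics host and site ID here are placeholders, not real GOV.UK values):

    // Standard Piwik tracker bootstrap; host and site ID are placeholders.
    const _paq: any[] = ((window as any)._paq = (window as any)._paq || []);
    _paq.push(['trackPageView']);
    _paq.push(['enableLinkTracking']);
    (() => {
      const u = 'https://analytics.example.gov.uk/'; // self-hosted, so data stays on your own servers
      _paq.push(['setTrackerUrl', u + 'piwik.php']);
      _paq.push(['setSiteId', '1']);
      const g = document.createElement('script');
      g.async = true;
      g.src = u + 'piwik.js';
      document.head.appendChild(g);
    })();

The real cost is in running and scaling the Piwik backend, not in the page-side integration.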
Sure, it’s possible to run your own analytics architecture, but at 100m visits a month is it practical? Point two of the GOV.UK design principles is to do less: https://www.gov.uk/design-principles#second . When GOV.UK first started it definitely had to hit the ground running, which required evaluating user data immediately. Certainly at the time there weren’t enough dev hours available, given a limited number of people, to evaluate, build and run an analytics framework of the scale necessary.
Having said that, then isn’t now, and I wouldn’t be surprised if a future Government as a Platform service is a cross-government analytics system hosted in the UK. I just don’t see the pressing need for it, based on the assumption that Google actually are anonymising the data. If you want to disagree with that assumption then that’s a valid viewpoint too, but I see no evidence for it.
> based on the assumption that Google actually are anonymising the data. If you want to disagree with that assumption then that’s a valid viewpoint too, but I see no evidence for it.
This is exactly the wrong way around. For every sensitive area, such as privacy, the burden is on the company to prove proper handling of data. But if that company is outside your jurisdiction, with no legal means available in their country, how could that ever be possible?
Taking just their word for it is like trusting the food industry on hygiene until its customers become undeniably sick.
A better strategy is not to create sensitive datasets in the first place. In Germany, this principle is called "Datensparsamkeit", which could be translated to "data frugality".
Moreover, every country should have something like the FDA for data hygiene. Unfortunately, even in Germany where we do have "Datenschutzbeauftragte" (data protection officers), they can make a lot of noise but don't have much power. This is still better than not having them at all, though.
I’m not disagreeing with you outright, but your argument can be extended ad infinitum. For instance, GOV.UK PaaS uses AWS as a host[1]. Is that worrying? Should we not be using Cisco gear because of US govt backdoors? Should we not be using Chinese-manufactured chips? These are considerations the military faces daily, but they hinder the ability to deliver. Analytics is arguable both ways (and I lean towards your argument at this point in time), but there are good reasons, past and present, for GA.
These aren't really equivalent to actively sending data about anyone using your pages out to overseas entities.
You absolutely should be taking precautions to make sure that what you're doing on AWS is secure. And in fact you should err on the side of using providers who host in European countries, preferably European organisations.
(addendum - I spent a lot of the early part of this year working on AWS-based data processing systems for a large bank; they took massive precautions with the transport and storage of their data within the AWS system, including IPSec overlays, 14-day maximum node lifetimes and various other things. I realise that at some point you're trusting Amazon, but there's a lot that can be done to avoid having problems in the first place. "Not sending data to places you don't absolutely have to" seems pretty basic)
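To make the node-lifetime point concrete, here's a rough sketch of the kind of scheduled job that enforces it, using the AWS SDK v3 for JavaScript. The 14-day cut-off matches the anecdote above; everything else (region, filters) is illustrative, not what the bank actually ran:

    import { EC2Client, DescribeInstancesCommand, TerminateInstancesCommand } from '@aws-sdk/client-ec2';

    const MAX_AGE_MS = 14 * 24 * 60 * 60 * 1000;        // 14-day maximum node lifetime
    const ec2 = new EC2Client({ region: 'eu-west-2' });  // illustrative region

    async function rotateOldNodes(): Promise<void> {
      // Find running instances older than the cut-off...
      const { Reservations = [] } = await ec2.send(new DescribeInstancesCommand({
        Filters: [{ Name: 'instance-state-name', Values: ['running'] }],
      }));
      const stale = Reservations
        .flatMap(r => r.Instances ?? [])
        .filter(i => i.LaunchTime && Date.now() - i.LaunchTime.getTime() > MAX_AGE_MS)
        .map(i => i.InstanceId!);
      // ...and terminate them; replacements are assumed to come from an auto scaling group.
      if (stale.length > 0) {
        await ec2.send(new TerminateInstancesCommand({ InstanceIds: stale }));
      }
    }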
With Google Analytics, the data going to Google is a certainty, and reasonable alternatives are available.
I think it shows a terrible attitude from the department responsible to use GA. Not everything should be done by the lowest bidder, regardless of the costs. Maybe I'll write to my MP...
Or, in full: "Government should only do what only government can do. If we’ve found a way of doing something that works, we should make it reusable and shareable instead of reinventing the wheel every time. This means building platforms and registers others can build upon, providing resources (like APIs) that others can use, and linking to the work of others. We should concentrate on the irreducible core."
Slightly different from the title.
Can't do privacy because the principle says "Do Less"? Why do anything?
>> Sure, it’s possible to run your own analytics architecture, but at 100m visits a month is it practical?
That's surely secondary to privacy and security?
>> If you want to disagree with that assumption then that’s a valid viewpoint too, but I see no evidence for it.
I see a whole litany of problems with your assumption, based on the state of the world in terms of large-scale leaks and hacks, on the laws in the US which are much weaker than our own protections, and on the actions of various US agencies when they desire access to privately held information.
I see no reason to believe that anonymisation of the data somewhere within Google's infrastructure is an adequate protection, whether it actually happens as contracted or not.
Look, it's clear that you don't consider this important enough to do something about. Some of us do, and some of us would rather that if you can't do this with private analytics, you don't do it at all. And if that makes delivering your service harder, slower and more expensive - so be it. What's happening here is not right.
I mean, what you're saying when you reference point 2 of your design principles is that it's alright to throw user privacy under a bus, so long as you can deliver the software effectively. This isn't a good justification.
I don’t actually work for GDS anymore (or even live in the UK) but people there are receptive. Can you leave a comment against the blog post? Unfortunately I don’t know the best way to raise your concerns officially, but that would hopefully get an answer.
I have worked and continue to work on several GOV.UK projects, and we take steps to ensure that anything that gets sent to GA is as anonymised as possible. Just last week we had a ticket in planning to further improve the anonymity of the data sent to GA for the UK MOT testing service, so it is definitely not like we're just firing all sorts of data off to third parties; we are thinking very hard about this stuff.
Preserving our users’ anonymity is one of our major priorities and we actively work to improve how we do this.
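For anyone wondering what this looks like in practice: one of the standard measures is simply asking GA to anonymise the IP before it's stored. This is only an illustrative sketch using the public analytics.js API, not the actual GOV.UK tracking code, and the property ID is a placeholder:

    // Illustrative analytics.js setup; 'UA-XXXXXXXX-X' is a placeholder property ID.
    declare const ga: (...args: unknown[]) => void;

    ga('create', 'UA-XXXXXXXX-X', 'auto');
    ga('set', 'anonymizeIp', true);   // zero out the last octet of the IP before Google stores it
    ga('send', 'pageview');

The rest of this sort of work is typically about keeping personally identifying details (query strings, form values, reference numbers) out of the URLs and events that get sent in the first place.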
.gov organizations, like your boring standard-issue commercial ones, also want to know who is using their site and how. It helps inform decision-making about changes.
If you don't mind another dumb question, is the benefit great enough for the hassle of complaints and the possible compromise of ostensibly private information?
There are self-hosted solutions, but I imagine they'd require experts to implement and interpret. I wonder if they could just do A/B testing and get the same results, or if they could simply do surveys? Though I suppose those come with new faults, variables, and expenses.
Generally, yes. It's also good for stuff like knowing what browsers are used and thus what technologies can be used.
The hassle of complaints and the possible compromise of "private" information (like what your browser sends) isn't that significant. The pain from complaints is trivial, largely confined to Hacker News threads of no significance whatsoever. The ostensibly private data is protected contractually and legally.
I'm not aware of any approach that both yields useful information and eliminates any possible risk of compromise. Surveys and A/B testing and self-hosted solutions all have the same problems and risks, and generally extra costs to boot.
In short, the cost-benefit analysis isn't all that different from that of a company considering an analytics tool.
Again, much thanks. No more questions, for now. Just a sincere thanks. I don't actually read HN for the articles. I do read many of them, but I'm here for the informative answers, insightful responses, and chances to get an education in subjects I'd not normally consider.
Once in a while, chaos theory or traffic modeling come up. So, I get to give back. ;-)
And I'll add another thing: this contract is between a government agency's bureaucrats and a private corporation, but it affects unknown numbers of private citizens who, really, have no choice but to have dealings with their government. Claiming that the contract is sufficient is laughable, even if Google didn't have every reason in the world to disregard it and simply pay the fine associated with disregarding it.
Satisfactory? Saying "if you don’t trust them then that’s fine" and ignoring anyone who disagrees that that's fine isn't giving an answer -- including the poster you just replied to with this link. I don't believe it's exactly easy to achieve and maintain such a lack of self-awareness.
This was brought up before when some other gov.uk site was discussed. One of the commenters mentioned that gov.uk has a strict licence agreement which prevents Google from accessing analytics data. I don't have a source for that, but it would make sense.
> One of the commenters mentioned that gov.uk has a strict licence agreement which prevents Google from accessing analytics data.
The problem is that a license agreement prevents nothing; it may establish consequences for an action, should it occur and should it be detected and should the consequences be enforced, but it is unable to prevent that action from occurring.
That's the difference between 'can' and 'may.' A license agreement can't prevent something; a technical measure (e.g., not using Google Analytics or using some form of privacy-preserving analytics protocol) can.
If there's anything that human history teaches us, it's that Murphy's Law (that whatever can go wrong, will) holds: if it's possible for someone to do something one would rather he not, he probably will, sooner or later.
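To make "technical measure" concrete, here's a toy sketch (the names and paths are mine, not from any gov.uk system) of the kind of thing a self-hosted setup can do: throw away the end of the IP address before anything is written down, so there is nothing left for an agreement to have to protect.

    // Toy example: strip the last octet of an IPv4 address before logging a visit.
    // If the full address is never stored, no contract is needed to keep it private.
    function truncateIp(ip: string): string {
      const parts = ip.split('.');
      return parts.length === 4 ? `${parts[0]}.${parts[1]}.${parts[2]}.0` : 'unknown';
    }

    function logVisit(path: string, remoteAddr: string): void {
      console.log(JSON.stringify({ path, ip: truncateIp(remoteAddr), ts: Date.now() }));
    }

    logVisit('/some-government-service', '203.0.113.42');  // records 203.0.113.0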
Sarcasm aside, I think this is an important point, and I would like to know if you or anyone here has suggestions for plug-and-play analytics solutions to look into for those of us who would prefer to avoid Google Analytics?
I think the judiciary gov.uk cookie usage information [1] provides good detail on their use of GA. In addition it provides a link to a Google-developed browser plug-in to 'opt you out' [2].
Only Google Analytics? I tried to watch an online news stream from our national broadcaster yesterday (http://www.abc.net.au/news/newschannel/) with NoScript turned on and found it had about 8 different trackers installed and refused to work unless they were enabled.
Not really contributing to the discussion, mate.
It's analytics, it's a good product, people use it because it's convenient.
What bearing does it have on the words in the article?
They are implying that it's a massive security and privacy hole and they think taking security advice from somebody "who has a key under the doormat" is problematic.
Not that I necessarily agree with the original comment but that's the reasoning.
Could I not make the same argument that using English is just making the NSA's job easier? I mean, they do tap the wires, don't they, making much of this moot.
Yup, and if you have a news website or a shop, go ahead. I specifically object to my own government sending data about citizen interactions with that government to an overseas third party.
Yes, it can be, and I believe convictions show this - if "public" means "someone else could have stumbled on the info, in theory" but you were specifically tipped off where to go and when in exchange for cash. Which is the kind of worry here.
But as well, policy discussions that allow no privacy tend to be very circumscribed.
For tax policy, it would make sense to solve this problem in the entirely opposite way - i.e. declare that any tax policy changes cannot take effect faster than the next fiscal year; it's also fair and much more reasonable.
Let's say they decide to put a tariff on something a lot of people hate or think problematic. Let's say, for argument's sake, 2-stroke generators. But they have a lot of power (and users) so you don't think you can swing a yearly tax. But a tax on new purchases/builds might be tractable. They get to keep all the existing ones but they can't build more. That's grandfathering.
If you give the wrong people until the end of the year, they'll grandfather in as many as they can, even if they sit unused. So now you're putting off curbing the installed base and instead you've actually created a glut. Now your tax will have zero net effect for four or five years and will make things (say, smog) far worse in the short term.
It can actually end up unwinding your tax (or regulation) --
If the people rushing to buy up the supply to simply store them off-line happen to cut off people who need them for what are sensible and essential reasons, you end up with a litany of cases of the tax causing problems paraded about as it comes into effect, and it ends up repealed or undercut because of political pressure.
(Usually 1-year window extended because "we need more time to phase it in" and by the time the second window would close, it's gutted.)
It's not just that it can end up counter productive in the short term, it's that it can undercut itself as a law as well by destabilizing a market that usually has low, but important volume.
In some places, laws already work sort of like that. Except for emergency bills, they can't take effect until something like 180 days after congress has had their final session.
It makes perfect sense, to me at least, for taxation regulations to be the same. Of course, they'd probably just default to calling all rule changes emergencies...
Sort of related: I love the idea of paying taxes. Mine are prepared and filed as soon as I can do so. I have a great time paying my taxes, to the point where I bring a bottle of wine and flowers to my accountant. But, man, I really hate the complexity and how the money is spent.
Anyhow, maybe they should only make rules for the first half of the year and then none of the changes go into effect until the first day of the year? I like your idea.
Fraud/spam detection algorithms have to be security by obscurity, or someone can just check if what they're doing trips the algorithm and then not do that.
If you use an opaque machine learning technique to detect fraud, then for the purposes of probity some kind of "parallel construction" could be employed to give a public explanation; it's surely important to explain and justify to the humans involved. The analogy is with the person of a particular ethnicity who is always getting pulled over by the cops. This must be made transparent.
You don't need "parallel construction". You need to inform it was detected by some opaque means, and that you proceeded to verify it with a transparent and legal procedure.
What you can't do is detect it by opaque means and then proceed to sanction people, or even gain unduly elevated investigative powers.
The act of verification (of suspected fraud) is not costless to the various parties involved. So say I keep getting inconvenienced by this verification because of my behaviour ("was it really you who made this request?"), then someone should be accountable for all those false positives. "Parallel construction" creates accountability. I know it usually means something more sinister -- but just syntactically it's a good description of the requirement.
I do think we are trying to say the same thing. When I say that no unreasonable additional power should be gained by such detection, I mean, among other things, that the verification should not be costly to the investigated party (the costs should fall only on the investigators). It looks like you are naming that verification "parallel construction", and so, I do agree.
In the past I really worried about open source networked software, because however you slice it, it is missing a huge chunk of security by obscurity, making 0days very likely. Especially for new projects. All security vulnerabilities are out in the open for all to see. Sure, honeypots can help you learn about vulnerabilities, but it will take years to patch them all. In the meantime, everyone using your software is vulnerable.
Then I discovered blockchains. Here, the software is run by the network and does nothing persistent unless a majority of the nodes agree. That makes it much harder to corrupt the persistence layer. Blockchains are NOT just for achieving global consensus about a ledger. They can be per-stream-of-data. That's the approach we take at Qbix.
There are still many other vectors of attack besides corrupting the database. However, in web apps, the really pernicious thing is corrupting the data. Everything else has already been secured by webserver makers and language runtime designers.
PS: Finally, you can corrupt things at the client level, e.g. making a client sign a transaction the user didn't authorize. But at least it is localized to the corrupted clients, and not the whole network.
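For what it's worth, here's a toy sketch of the general shape of a per-stream chain with majority sign-off. This is emphatically not Qbix's actual implementation, just an illustration of the idea that nothing persists unless enough nodes agree:

    import { createHash } from 'crypto';

    interface Entry { prevHash: string; payload: string; hash: string; }

    // Append to a single stream's chain only if a majority of nodes acknowledged the write.
    function appendEntry(chain: Entry[], payload: string, acks: number, totalNodes: number): Entry | null {
      if (acks * 2 <= totalNodes) return null;             // no majority, nothing is persisted
      const prevHash = chain.length ? chain[chain.length - 1].hash : '0'.repeat(64);
      const hash = createHash('sha256').update(prevHash + payload).digest('hex');
      const entry: Entry = { prevHash, payload, hash };
      chain.push(entry);
      return entry;
    }

Each stream gets its own chain, so consensus is scoped to that stream's participants rather than one global ledger.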