Sublime Forum

RFC: Default Package Control Channel and Package Telemetry

#25

Oh, so it’s fine then :roll_eyes:. You needed to mine some data without people knowing, you got what you needed and now that you’re caught you just have to try to make it to the exit swiftly and silently before it blows up any further. Bravo.

You do understand that this discussion isn’t about this particular situation, but about preventing it from ever happening again.

An option for “Allow plugin authors to collect anonymous usage statistics” similar to Atom’s could be helpful.

A blanket opt-in is useless. Users should know exactly which package requests telemetry and should be able to opt-in on a per package basis.

9 Likes

#26

Plugins should make very clear that they are collecting information and be very clear about what they are collecting. I think it should never default to collecting without consent. It should also be per plugin. Just because I give consent to one, I may not want to give consent to all.

Spoiler alert, I don’t want to give consent to any plugin. I frankly don’t trust plugins with this kind of information.

6 Likes

#27

The question is, why did you try to hide who the data was being sent to? And why did you ask to capture activeNonBundledPackageNames? That bit of data seems like a very non-anonymous collection of information. You could be capturing internal package names and consequently exfiltrating the existence of development of competitors’ products.

I’m having a really hard time understanding why you’ve been so careful about being anonymous about capturing data from the users of (at least) two popular text editors. Do you have your fingers in VS Code, VIM or Emacs packages also?

11 Likes

#28

With the realization of the massive privacy leak that SideBarEnhancements has been perpetrating (sending a list of every installed package to a commercial entity that has been hiding its involvement), I’ve made a judgement to remove it in an attempt at preventing further damage. It can be added back if/when we see a concern for the privacy of users.

10 Likes

#29

Also, why does it take exposure via a thread like this for you to take action? You say you’re staying open, but why didn’t you pull your involvement from SidebarEnhancements when you got caught on the Atom packages last week?

It’s plainly obvious you fail to understand the extent to which you abused and continue to abuse your target audience. So please don’t try to spin this or “engage with the community”, it’s insulting.

2 Likes

#30

@braver the truth is we didn’t remember. This was done the better part of a year ago and we haven’t looked at it in a while.

(We also didn’t remember this telemetry was in the kite-installer package linked to in OP. That will be removed later today.)

@wbond I believe we did not capture internal package names, or that we filtered them out in the code.

0 Likes

#31

I don’t see any filtering happening here: https://github.com/SideBarEnhancements-org/SideBarEnhancements/blob/1858330b71b682ec1f3265bd280dde65465bdf18/Stats.py#L86.

1 Like

#32

Incompetent much?

Edit: shouldn’t post when angry.

0 Likes

#33

@wbond I have some internal packages installed manually into ST3 but when I run sublime.load_settings('Package Control.sublime-settings').get('installed_packages') I don’t see them listed. (?)

0 Likes

#34

It depends on how you install them. If you git cloned or manually installed, they won’t be in the installed list. If you installed from a private channel file, or by pasting the git repo as a Package Control repository and used Package Control to install/update then they will be in the list.

If you wanted info about popularity of packages. https://packagecontrol.io/browse/popular.json is probably what you should have used.
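The distinction above (Package Control lists only what it installed itself, not packages that were cloned or copied in by hand) can be sketched roughly as follows. This is a simplified illustration, not how Package Control works internally. It assumes the settings file is plain JSON without comments, and it looks only at unpacked folders in a `Packages` directory. The function names, the `ignore` parameter, and the package names in `demo()` are made up for the example.

```python
import json
import os
import tempfile


def read_tracked_packages(settings_path):
    """Return the names listed under "installed_packages" in a settings file."""
    with open(settings_path, encoding="utf-8") as fh:
        settings = json.load(fh)  # assumption: plain JSON, no comments
    return set(settings.get("installed_packages", []))


def find_untracked_packages(packages_dir, tracked, ignore=("User",)):
    """Return package folders on disk that are not in the tracked list."""
    on_disk = {
        name
        for name in os.listdir(packages_dir)
        if os.path.isdir(os.path.join(packages_dir, name))
    }
    return sorted(on_disk - set(tracked) - set(ignore))


def demo():
    """Build a throwaway Packages folder and report the untracked packages."""
    with tempfile.TemporaryDirectory() as root:
        packages_dir = os.path.join(root, "Packages")
        # Hypothetical package names, for illustration only.
        for name in ("User", "SomePublicPackage", "InternalTool"):
            os.makedirs(os.path.join(packages_dir, name))
        settings_path = os.path.join(
            packages_dir, "User", "Package Control.sublime-settings"
        )
        with open(settings_path, "w", encoding="utf-8") as fh:
            json.dump({"installed_packages": ["SomePublicPackage"]}, fh)
        tracked = read_tracked_packages(settings_path)
        return find_untracked_packages(packages_dir, tracked)
```

In this sketch, the manually created `InternalTool` folder is the only one reported, because it never went through the tracked list.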

2 Likes

#35

Actually, the same concern is also applicable to commercial packages and Package Control binary dependencies. A package developer can ship their packages with binary dependencies which somehow collect user information.

One of my packages requires PyWin32, and it seems that there is no good way to prove that the binary files are from a credible source.

0 Likes

#36

You make a good point. It seems like something along the lines of Little Snitch would be required to tell if binary packages are making web requests.

Honestly, that’s probably the only feasible way to tell anyway, since it won’t be possible to review the source of every package.

I think we’ll probably have to have a rule and consequences for breaking the rule, but rely on the community for identification.

8 Likes

#37

I’d be interested to see something like the manifest.json format in Chrome Applications.

It’d be hard to limit actual behavior, but as different permissions are brought up like “collect file data” or “collect other package data” packages can explicitly tell the user what they are doing.

Ideally you’d also be able to deny new permissions from this manifest and those particular features are turned off.

If a package doesn’t list tracking/network traffic or it ignores the user’s choice when asking for permissions, I see that as a 1 strike offense. The package is removed and any other packages associated with that author and/or corporation are reviewed.

This will be harder since, unlike iOS, Chrome, or Android, ST and Package Control don’t have as much of a sandbox around plugins.
But, keeping things to an honor system with clear consequences I think should keep the channel fairly clear.
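A rough sketch of what such a manifest check might look like is below. Nothing like this exists in Package Control today. The manifest layout, the permission names, and every function name here are hypothetical, chosen only to mirror the “collect file data” and “collect other package data” examples above. Since there is no sandbox, a check like this could only document and honor declared permissions, not enforce them.

```python
import json

# Hypothetical permission names, modeled on the examples in the post above.
KNOWN_PERMISSIONS = {
    "network",
    "collect_file_data",
    "collect_other_package_data",
}


def parse_manifest(text):
    """Parse a manifest string and return (package name, requested permissions)."""
    manifest = json.loads(text)
    requested = set(manifest.get("permissions", []))
    unknown = requested - KNOWN_PERMISSIONS
    if unknown:
        # Reject manifests that ask for something the user cannot be shown.
        raise ValueError("unknown permissions: " + ", ".join(sorted(unknown)))
    return manifest["name"], requested


def effective_permissions(requested, denied_by_user):
    """Permissions the package declared minus the ones the user turned off."""
    return set(requested) - set(denied_by_user)


def is_allowed(action, requested, denied_by_user):
    """True only if the action was declared and the user did not deny it."""
    return action in effective_permissions(requested, denied_by_user)
```

A package declaring `["network", "collect_other_package_data"]` would then be allowed to use the network, but not to read file data (never declared) or list other packages if the user denied that permission.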

0 Likes

#38

@wbond

This would not prevent authors from creating their own PC channel

Just so I understand correctly, this means something like SublimeLinter’s separate channel for all its plugins? If so, this would be a huge cut into how we currently operate our reviewal process, and I’m sure would increase the burden on Package Control maintainers to work around our approval system.

To make a statement about this issue: SublimeLinter cares about your privacy and does not tolerate data collection as shown by Kite or other parties. We do not, and will never, give away or sell your information without prior approval as long as I am involved. That being said, our contributor plugins are not fully under the SublimeLinter Community jurisdiction and we can not promise that same level of service. We will work with users to help remedy the issue if one of our contributors is found collecting information.

4 Likes

#39

Just to be clear on the SublimeLinter topic: the SublimeLinter “channel” is actually a “repository” and included in the default channel. This is mostly a relic from old times and, as far as I’m aware, the different types of specifying packages could be unified (and the terminology is kind of confusing), but that does not matter for this topic. I believe that I created an issue about this on the PC repo a while ago.

I will comment on the actual topic of this thread later with more time at hand.

2 Likes

#40

Okay, back onto the topic at hand:

I am very much in favor of strictly requiring telemetry and other user data to be sent only when the user has opted in. This may occur via a popup on first installation asking whether data should be collected or by having to specify an API key in some settings file, which should be enough user action to determine that the user is indeed inclined to have their data shared with some remote. Of course, Package Control would be excluded from this with the privacy policy mentioned in OP.
This should be the first step.
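As a minimal sketch of the opt-in rule, assuming a hypothetical `telemetry_opt_in` setting: nothing is sent unless the user has explicitly set that key to `true`. The class name, the setting key, and the `send` callback are invented for the example. In a real plugin the settings object would presumably come from `sublime.load_settings(...)`, which also has a `.get()` method, but a plain dict is used here so the example runs outside Sublime Text.

```python
class TelemetryGate:
    """Forward events to `send` only when the user has explicitly opted in."""

    def __init__(self, settings, send):
        self._settings = settings  # any object with a dict-like .get()
        self._send = send          # callable that would transmit the event

    def is_opted_in(self):
        # Only an explicit boolean True counts as consent. A missing key,
        # None, or a string such as "true" is treated as "no".
        return self._settings.get("telemetry_opt_in") is True

    def report(self, event):
        """Send the event if allowed; return whether anything was sent."""
        if not self.is_opted_in():
            return False
        self._send(event)
        return True
```

With an empty settings dict, `report()` returns `False` and the callback is never invoked, which is the default the policy asks for.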

Any package found violating this rule will be purged from the channel immediately, probably with a one strike system and re-adding it once when the violation is removed. Since there is no reporting system on the site (yet), I suggest contacting me or Will directly when you find a package like this. You can find us on the discord server, for example (just mention us/me in the #general channel with a url to the offending package). Note that we will only review packages on their initial submission and when specifically reported to us, as we can’t possibly review each and every update.

On a similar note, packages doing other malicious things to the system (Python doesn’t run in a sandbox) should be removed immediately, with all the author’s other packages purged as well. No strikes here.


Now, onto whether data collection should be banned entirely: I believe this is not possible. There are a couple packages that specifically send stuff to some remote as their designated feature, such as time-tracking packages like Wakatime or packages that upload snippets to pastebin sites. These would be affected by this policy, too, even though they are certainly useful for some. A privacy policy statement from these packages about what data is collected and how it is used would certainly be appreciated, though I don’t currently see how this can be implemented reasonably. For now, I say the readme is enough and we require this to be present (for new packages).


@braver suggested to also only permit user data being sent if it exists in the package from the beginning and was not added later. I don’t support this stance, because there may very well be scenarios where you add an additional feature that is related to your package but does not need to be in its own package for architectural or topical reasons. I don’t want to force too many restrictions on developers when there don’t need to be. The feature must be opt-in either way, so it doesn’t matter whether the user explicitly installs the “new separate package” with that feature or enables it manually through some action.

From a review standpoint, this doesn’t matter either, except that it would be easier on us as we don’t have to try and track down the state in which the package was submitted. In order to find out whether a package is violating policies, it has to be reviewed either way.

7 Likes

#41

Fair enough and plenty good arguments all around.

Besides these recent transgressions related to Kite, the threat level isn’t even that severe. So, as long as there is an agreement and a way to act on violations, we should be good. Requirements can always be tightened if the need arises.

2 Likes

#42

Here is a very practical question from a ST3 and SidebarEnhancements user:

  1. Can I just disable SidebarEnhancements to avoid further data collection?
  2. SidebarEnhancements is very useful and I rely on the functionality pretty much every day. Is there an easy way to keep using SidebarEnhancements without the data collection in the background? E.g. can I block it or would it be possible to republish SidebarEnhancements without the data collection if the license allows it?

0 Likes

#43

Yes, disabling the package should stop the code from executing at all.

SidebarEnhancements is very useful and I rely on the functionality pretty much every day. Is there an easy way to keep using SidebarEnhancements without the data collection in the background? E.g. can I block it or would it be possible to republish SidebarEnhancements without the data collection if the license allows it?

There are options:

0 Likes

#44

I’ve just re-added SideBarEnhancements to the default channel since Tito removed the stats code and cut a new release. The package should be back in the channel shortly once the crawler picks it up, after that everyone should be updated to the new version the next time their install checks for updates.

2 Likes