Okay, back onto the topic at hand:
I am very much in favor of strictly requiring that telemetry and other user data be sent only when the user has opted in. This could happen via a popup on first installation asking whether data may be collected, or by having to specify an API key in some settings file, which should be enough user action to establish that the user does indeed want their data shared with some remote. Of course, Package Control would be excluded from this with the privacy policy mentioned in the OP.
This should be the first step.
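To make the opt-in idea a bit more concrete, here is a rough sketch of the kind of check I'd expect a package to run before sending anything. The setting names `telemetry_opt_in` and `api_key` are made up for illustration, and a plain dict stands in for the package's settings; this is not a required implementation, just one way it could look.

```python
def may_send_data(settings):
    """Return True only if the user took an explicit action to opt in.

    `settings` is any mapping with a .get() method. The keys used here
    ("telemetry_opt_in", "api_key") are hypothetical examples.
    """
    # Only an explicit boolean True counts; a missing key or any other
    # value (including truthy strings) means "not opted in".
    if settings.get("telemetry_opt_in") is True:
        return True

    # Alternatively, a non-empty API key entered by the user counts
    # as sufficient user action.
    api_key = settings.get("api_key")
    return isinstance(api_key, str) and api_key.strip() != ""


def send_if_allowed(settings, payload, send):
    """Call send(payload) only when the user has opted in."""
    if not may_send_data(settings):
        return False
    send(payload)
    return True
```

In an actual plugin the mapping would presumably be the object returned by `sublime.load_settings(...)`, which also offers `.get()`. The important part is the default: with nothing configured, nothing is sent.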
Any package found violating this rule will be purged from the channel immediately, probably with a one-strike system where it is re-added once when the violation is removed. Since there is no reporting system on the site (yet), I suggest contacting me or Will directly when you find a package like this. You can find us on the discord server, for example (just mention us/me in the #general channel with a url to the offending package). Note that we will only review packages on their initial submission and when specifically reported to us, as we can’t possibly review each and every update.
On a similar note, packages doing other malicious things to the system (Python doesn’t run in a sandbox) should be removed immediately, with all of the author’s other packages purged as well. No strikes here.
Now, onto whether data collection should be banned entirely: I believe this is not possible. There are a couple of packages that specifically send stuff to some remote as their designated feature, such as time-tracking packages like Wakatime or packages that upload snippets to pastebin sites. These would be affected by such a ban, too, even though they are certainly useful for some. A privacy policy statement from these packages about what data is collected and how it is used would certainly be appreciated, though I don’t currently see how this can be implemented reasonably. For now, I say a statement in the readme is enough and we require this to be present (for new packages).
@braver suggested also only permitting user data to be sent if that behavior existed in the package from the beginning and was not added later. I don’t support this stance, because there may very well be scenarios where you add an additional feature that is related to your package but does not need to be in its own package for architectural or topical reasons. I don’t want to force too many restrictions on developers when there is no need for them. The feature must be opt-in either way, so it doesn’t matter whether the user explicitly installs the “new separate package” with that feature or enables it manually through some action.
From a review standpoint, this doesn’t matter either, except that it would be easier on us as we wouldn’t have to try and track down the state in which the package was submitted. In order to find out whether a package is violating policies, it has to be reviewed either way.