Sublime Forum

Package -- Load PyPI Dependency

#1

Hello –

I would like to create a Sublime Text package that allows users to export plain-text documents (marked-up prose, not code) to .docx files. I am a web developer; I am not a Python developer, but am willing to learn / am currently learning…

Anyway, I would like to use the “python-docx” Python package in my Sublime Text package, and I have no idea if that’s possible, or how to do it…

https://python-docx.readthedocs.io/en/latest/user/install.html

There’s this SO post that makes it look like I maybe could import it from the raw files (if I include them inside the package, I think?)… but then “python-docx” itself has a dependency on a package called “lxml”, and I have no idea how far that particular rabbit-hole goes, or how to resolve dependencies’ dependencies this way, etc…

I also read the Package Control dependencies documentation and didn’t understand it at all… I’ve used NodeJS / NPM before, so I feel like I should be able to get this, but I literally read this page (and looked through the example stuff it links to) and said “What?” out loud, then re-read it, and then that’s when I came here for help…

Also, if there’s an alternative way to accomplish generating .docx files from within a Sublime Text package I’m all ears… I just need things to be as dead-simple for authors (e.g. non-developers) as possible, so they can just install a package and be done, and not have to install and configure a bunch of other things on their machines…

Thanks all (so much!) in advance for any help you can provide.

#2

Generally in Python there’s a set of locations that are searched when you import a module. That list is sys.path; it contains some paths that Python inherently knows about, can be augmented via the PYTHONPATH environment variable, and can be manipulated at runtime by changing sys.path directly (this is somewhat of an oversimplification, but that’s the general idea).

Python modules that are installed thus usually end up in places where the interpreter knows about them, and you can augment that as needed via PYTHONPATH. You also have the ability to use a virtualenv, which works the way you might expect a nodejs app to work, with all of the modules you install being locally installed for a project.

The Python interpreter that Sublime embeds still has the idea of a list of places to look for modules you try to import, but it’s completely separate from any version of Python you may or may not have installed on your machine, and also doesn’t have a built-in notion of a virtualenv. The list of places that are searched is more constrained, so in order to use external Python code you need to take extra steps (as you’ve seen).

To better explain how this all works, open the Sublime console with View > Show Console and enter the following Python lines:

import sys
sys.path

That prints the list of locations where the import mechanism in the plugin host will look for things to load. What you see here depends on things like where you installed Sublime and what packages you have installed (if any) that use dependencies.
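If the single long list is hard to read, a small sketch like this one prints each entry on its own numbered line, which makes it easier to match them up with the items described next. The helper name numbered_sys_path is made up for illustration.

```python
import sys


def numbered_sys_path():
    """Return the sys.path entries as '1. /some/path' style strings."""
    return ["{}. {}".format(index, entry)
            for index, entry in enumerate(sys.path, start=1)]


# Print one entry per line instead of a single long list.
for line in numbered_sys_path():
    print(line)
```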

The first four items you see when you do this are common to all Sublime installations:

  1. The installation folder for Sublime, where the Python files that make up the public facing API for Sublime plugins are stored; this is what makes things like import sublime work in your plugin
  2. The zip file that provides the core Python classes and runtime for the interpreter; this ships with Sublime and is where things like the os module and built-ins come from
  3. A special folder set aside in the Lib folder (see below for more info on this)
  4. The Packages folder for Sublime; although it’s not immediately obvious, this also covers packages that are installed as sublime-package files. See this video if you’re unfamiliar with the different places and ways packages can be installed in Sublime. The important thing to note is that all packages, regardless of where they’re stored, are mapped by the Sublime resource system so that they logically appear here.

Depending on what packages you have installed in Sublime, and if any of them have any dependencies that they use, you will see other items here as well, which are the result of Package Control installing dependencies.

Generally speaking, in order to use any external third party Python library, it needs to be installed in one of the first four locations that you see here, or somewhere below them, and realistically that really just covers items #3 and #4 (modifying the Python runtime or fiddling inside of the Sublime installation folder are definite no-no’s).

Everything else involves putting the code “somewhere” and then modifying the sys.path to include that location. Choosing such a location is tricky; you want it to be somewhere associated with a user’s Sublime configuration, in an accessible place, without worrying about clobbering anything. The most obvious place is inside of your Package, which is already covered by item #4 (and doesn’t require fiddling with sys.path).

Method 1: Vendoring

The first way to do something like this would be to vendor the library inside of your package; essentially you would take the installed version of the code and drop it into a folder inside of your package, and then you can use a relative import to load it in your own plugin code.

For example in your case, if your package was named MyPackage, you would create a folder inside of it named docx, and inside of there put the source for the package, then in your code you can use a relative import like from .docx import Document.
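As a rough sketch of that layout, the following builds a throwaway folder that plays the role of the Packages folder and loads a vendored library through a relative import. It runs in a regular Python interpreter; the names vendored_lib, plugin.py and describe are stand-ins I made up (the real thing would be the docx folder and Document), so treat it as an illustration of the import mechanics only.

```python
import importlib
import os
import sys
import tempfile

# Hypothetical layout, standing in for Packages/MyPackage:
#   MyPackage/plugin.py                  <- your plugin code
#   MyPackage/vendored_lib/__init__.py   <- the vendored library
FILES = {
    "MyPackage/vendored_lib/__init__.py":
        "def describe():\n    return 'loaded from inside MyPackage'\n",
    "MyPackage/plugin.py":
        "from .vendored_lib import describe\n",
}


def write_tree(root, files):
    """Create each file (and its parent folders) below root."""
    for relative_path, content in files.items():
        full_path = os.path.join(root, *relative_path.split("/"))
        os.makedirs(os.path.dirname(full_path), exist_ok=True)
        with open(full_path, "w") as handle:
            handle.write(content)


# The temporary folder plays the role of Sublime's Packages folder,
# which is already on sys.path inside the plugin host.
packages_dir = tempfile.mkdtemp()
write_tree(packages_dir, FILES)
sys.path.insert(0, packages_dir)
importlib.invalidate_caches()

# plugin.py reaches the vendored code via "from .vendored_lib import ...".
plugin = importlib.import_module("MyPackage.plugin")
print(plugin.describe())
```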

An issue with this method is that many Python modules contain more than one file and may assume that they can use an absolute import to load other files within themselves. If that’s the case then you need to modify the code in the library so that it still loads properly from inside your package, which can be non-trivial to accomplish.
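To make that failure mode concrete, here is a small sketch with two made-up libraries vendored inside a stand-in package. One imports its own submodule by its top-level name, the other uses a relative import. All of the names (VendorDemo, absolute_lib, relative_lib) are hypothetical, and it runs in a plain Python interpreter rather than in Sublime.

```python
import importlib
import os
import sys
import tempfile

# "absolute_lib" imports its own submodule by its top-level name, which only
# works if the library sits directly on sys.path. "relative_lib" uses a
# relative import, so it keeps working wherever its folder is placed.
FILES = {
    "VendorDemo/absolute_lib/__init__.py": "import absolute_lib.helpers\n",
    "VendorDemo/absolute_lib/helpers.py": "VALUE = 1\n",
    "VendorDemo/relative_lib/__init__.py": "from . import helpers\n",
    "VendorDemo/relative_lib/helpers.py": "VALUE = 1\n",
}

packages_dir = tempfile.mkdtemp()  # plays the role of the Packages folder
for relative_path, content in FILES.items():
    full_path = os.path.join(packages_dir, *relative_path.split("/"))
    os.makedirs(os.path.dirname(full_path), exist_ok=True)
    with open(full_path, "w") as handle:
        handle.write(content)

sys.path.insert(0, packages_dir)
importlib.invalidate_caches()


def can_import(module_name):
    """Return True if the module imports cleanly, False on ImportError."""
    try:
        importlib.import_module(module_name)
    except ImportError:
        return False
    return True


print("absolute self-import:", can_import("VendorDemo.absolute_lib"))
print("relative self-import:", can_import("VendorDemo.relative_lib"))
```

The fix for a library like the first one is to rewrite its internal imports in the relative style, which is the code modification mentioned above.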

On the plus side, this allows you to directly specify what version you’re using, if that’s important.

Method 2: Package Control Dependency

A second way to do this is to create a Package Control dependency. In that case what you’re doing is creating a specially structured package that contains the library and adding it to Package Control. In MyPackage you would indicate that you depend on the dependency and Package Control will install it when your package gets installed (if it’s not already present).

The upside in this case is that you can say from docx import Document like you would expect to, and there is less manipulation of the code needed (possibly as little as just moving the file layout around a bit).

This also makes the library available for other packages that might need it, which is a nice bonus as well.

Method 3: Use the Lib Folder

The folder inside of the Lib folder from item #3 above is directly on the sys.path, so depending on how the thing you’re trying to use is set up, you can just drop its folder directly inside of that folder and it immediately becomes available to all plugins.

For example, if you created $DATA/Lib/python3.3/docx/ and put the code from the library inside of it, in theory it should Just Work; from docx import Document would find the code inside of that folder the same as an external Python interpreter would.

The hand-wave here is that it may not be straightforward to figure out what the contents of the folder should ultimately be. Also there is (currently) no mechanism that will automatically put things here, so at the moment if you want to go that route the onus is on you to get the code there somehow. In your case that would likely mean writing code to do it automatically so that your users don’t need to do it manually.
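A minimal sketch of what "writing code to do it automatically" could look like follows. The function install_into_lib and its parameters are hypothetical, and the demonstration uses throwaway folders rather than a real Sublime data directory.

```python
import os
import shutil
import tempfile


def install_into_lib(source_dir, data_dir, python_tag="python3.3"):
    """Copy a library folder into <data_dir>/Lib/<python_tag>/.

    Returns the destination path. An existing copy is left untouched.
    """
    lib_dir = os.path.join(data_dir, "Lib", python_tag)
    folder_name = os.path.basename(os.path.normpath(source_dir))
    destination = os.path.join(lib_dir, folder_name)
    if not os.path.isdir(destination):
        os.makedirs(lib_dir, exist_ok=True)
        shutil.copytree(source_dir, destination)
    return destination


# Demonstration with throwaway folders instead of a real Sublime install.
work_dir = tempfile.mkdtemp()
source = os.path.join(work_dir, "bundled", "docx")
os.makedirs(source)
with open(os.path.join(source, "__init__.py"), "w") as handle:
    handle.write("# placeholder for the real library code\n")

data_dir = os.path.join(work_dir, "data")
installed = install_into_lib(source, data_dir)
print(installed)
```

Inside a plugin, I would assume the data directory can be derived as the parent of sublime.packages_path(), called from plugin_loaded() so the API is ready, but verify that against your own install before relying on it.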

Method 4: Put the code somewhere, then modify sys.path

This is a take on the previous method; somehow get a version of the package installed on the machine, and then your package modifies sys.path so that the location will be found and the code can be loaded.

The safest place to put the code, so that you’re not clobbering anything or writing outside of the places where Sublime-related things go on the user’s computer, is inside your own package folder.

In that regard this is the same as Method 1 (vendoring), except that you’re changing sys.path at runtime instead of using relative imports. Fiddling with sys.path is probably discouraged where it can be avoided, if I had to guess, but it does net you the potential of not having to modify the library’s code if it doesn’t use relative imports.
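Here is a sketch of the sys.path approach, runnable in a plain Python interpreter. The helper add_to_sys_path, the libs folder and the bundled_example module are all made-up names; in a real plugin the folder would more likely be built from os.path.dirname(__file__), assuming the package is installed unpacked.

```python
import importlib
import os
import sys
import tempfile


def add_to_sys_path(folder):
    """Append folder to sys.path unless it is already listed."""
    if folder not in sys.path:
        sys.path.append(folder)


# Stand-in for a "libs" folder shipped inside your package.
libs_dir = os.path.join(tempfile.mkdtemp(), "libs")
os.makedirs(libs_dir)
with open(os.path.join(libs_dir, "bundled_example.py"), "w") as handle:
    handle.write("ANSWER = 42\n")

add_to_sys_path(libs_dir)
add_to_sys_path(libs_dir)  # calling it again does not add a duplicate
importlib.invalidate_caches()

# The library is imported by its normal top-level name, so its own
# absolute imports would keep working without modification.
bundled_example = importlib.import_module("bundled_example")
print(bundled_example.ANSWER)
```

Appending rather than inserting at the front keeps the bundled copy from shadowing anything that is already importable, which seems like the more conservative choice.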

That post looks like it’s using the last method here; vendor the code in your package and then modify the sys.path so that it’s found. That would probably work, but would require some effort.

I’m not super familiar with Python code outside of the environment of Sublime Package development, but in the case of packages there’s not a mechanism that allows a dependency to depend on another dependency as far as I’m aware. Thus the onus would be on you as the person using the library to make sure that they’re all installed.

In this case that would mean that you are indeed going down the rabbit hole to some unknown depth on this one (though for what it’s worth, lxml is in the list of available package control dependencies).

It’s also worth mentioning that at the moment the version of Python that Sublime embeds is 3.3, which is pretty old at this point. So before going any further you need to verify that there’s a version of the package that you want to use that will work in that version of Sublime (and ditto on all of the other dependencies involved).
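A quick way to check what you are dealing with is to look at sys.version_info in the Sublime console. The helper below is a hypothetical convenience wrapper around that check, not something Sublime provides.

```python
import sys


def interpreter_at_least(major, minor, version_info=None):
    """Return True if the interpreter version is at least major.minor."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= (major, minor)


# Run this in the Sublime console to check the embedded interpreter.
print(tuple(sys.version_info[:2]), interpreter_at_least(3, 3))

# The same check against an explicit version: a library that needs
# Python 3.6 features would not load on a 3.3 interpreter.
print(interpreter_at_least(3, 6, version_info=(3, 3, 6)))
```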

#3

To show an example of how the dependency system works end-to-end, install PackageDev if you don’t already use it (it’s a handy package and thus recommended just on general principle).

Once it’s installed, use View Package File from the command palette and open PackageDev/dependencies.json, which is a file that PackageDev includes to tell Package Control what dependencies it uses. That looks like this currently:

{
    "*": {
        ">=3000": [
            "pyyaml",
            "pathlib",
            "sublime_lib"
        ]
    }
}

This is saying that on all platforms (*) if the build of Sublime Text is 3000 or higher, the dependencies that need to be installed are pyyaml, pathlib and sublime_lib. Here different things could be required for windows versus linux or based on the build of Sublime (say for example a newer version of Sublime includes something that an older one did not).
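To illustrate how a file like that can be read, here is a simplified model in Python. To be clear, this is not Package Control’s actual implementation: the functions build_matches and resolve are made up, only a handful of selector forms are handled, and how Package Control combines several matching entries is not modeled (this sketch just merges them).

```python
import re

# The structure from PackageDev/dependencies.json shown above.
DEPENDENCIES = {
    "*": {
        ">=3000": ["pyyaml", "pathlib", "sublime_lib"],
    },
}


def build_matches(selector, build):
    """Check a selector such as '>=3000' or '*' against a build number."""
    if selector == "*":
        return True
    match = re.match(r"^(>=|<=|>|<)(\d+)$", selector)
    if not match:
        return False
    operator, number = match.group(1), int(match.group(2))
    return {
        ">=": build >= number,
        "<=": build <= number,
        ">": build > number,
        "<": build < number,
    }[operator]


def resolve(dependencies, platform, build):
    """Collect dependency names whose platform and build selectors match."""
    names = []
    for platform_key in ("*", platform):
        for selector, entries in dependencies.get(platform_key, {}).items():
            if build_matches(selector, build):
                names.extend(name for name in entries if name not in names)
    return names


print(resolve(DEPENDENCIES, "linux", 3211))
print(resolve(DEPENDENCIES, "windows", 2221))
```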

When Package Control installs PackageDev, it sees this information and then installs the dependencies as well. As noted in the post above, as far as I’m aware this isn’t done recursively, so if any of the dependencies also rely on other things, they would need to be manually included here.
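For your case, that would mean listing both the library and the things it depends on yourself. As a sketch, this generates the text of such a dependencies.json; lxml is in the list of available dependencies as mentioned above, while "python-docx" is a hypothetical dependency name that would first have to be created and submitted.

```python
import json

# "lxml" is an existing Package Control dependency per the discussion above.
# "python-docx" is a HYPOTHETICAL dependency name used for illustration.
dependencies = {
    "*": {
        ">=3000": [
            "lxml",
            "python-docx",
        ]
    }
}

# This text would be saved as dependencies.json in the root of MyPackage.
text = json.dumps(dependencies, indent=4)
print(text)
```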

You’ll also note if you use Preferences > Browse Packages that there are now folders named pyyaml, pathlib and sublime_lib in the Packages folder where there may not have been before. That’s because Package Control actually installs dependencies as regular packages. A better place would be the Lib folder mentioned in the post above, but that is a recent-ish addition to Sublime and didn’t exist back when Package Control was created.

Sublime considers any py file in the top level of a package a plugin and tries to automatically load it. That’s not desirable for things like libraries, so if you look inside of sublime_lib you’ll notice that there aren’t any top level py files present.

There is however a folder named st3, and inside of that is a folder (confusingly) named sublime_lib, and inside of that is what appears to be the content of the actual package.

This is an artifact of how Package Control installs dependencies. Dependencies can be platform specific, architecture specific and Sublime version specific. So the author of the dependency would have one or more folders inside of the dependency that contain the specific versions needed, and Package Control will keep the appropriate one for the install and discard the rest. The other files in the top level are at the discretion of the dependency author.

In this case sublime_lib requires Sublime Text 3, so the st3 folder indicates to Package Control the one that should be used here. If one platform required more code than another, there might be a st3_linux and st3_windows, for example. Similarly if native libraries were needed, you’d see things like st3_linux_x64 and st3_linux (plus many others).
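The naming pattern can be sketched as a small function that lists the folder names that could apply to a given setup, from least to most specific. This only mirrors the names seen in this thread (all, st3, st3_linux, st3_linux_x64); the function candidate_folders is made up, and how Package Control actually chooses among the folders is not modeled.

```python
def candidate_folders(st_version, platform, arch):
    """List possible dependency folder names, least specific first."""
    base = "st{}".format(st_version)
    return [
        "all",
        base,
        "{}_{}".format(base, platform),
        "{}_{}_{}".format(base, platform, arch),
    ]


print(candidate_folders(3, "linux", "x64"))
```

In a plugin, I believe sublime.platform() and sublime.arch() return values in the same form as the platform and arch arguments used here.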

If you perform the steps from the post above to check the value of sys.path, among the list of paths you’ll see are the following (here $DATA represents wherever Sublime is storing your user specific information):

  • ‘$DATA/Packages/sublime_lib/st3’
  • ‘$DATA/Packages/pathlib/all’
  • ‘$DATA/Packages/pyyaml/st3’

Now we can see that the sys.path has an entry in it that points directly at the place where the inner sublime_lib exists. Thus you can import from sublime_lib and it will pull code from inside of that folder.
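If you want to confirm where an import is actually coming from, a sketch like this reports the file a module was loaded from. The helper module_location is a made-up name, and json is used only because it exists in any Python installation; in the Sublime console you would pass "sublime_lib" instead and would expect a path under the st3 folder.

```python
import importlib


def module_location(name):
    """Import a module by name and report the file it was loaded from."""
    module = importlib.import_module(name)
    return getattr(module, "__file__", None) or "(no file on disk)"


print(module_location("json"))
```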

The entry in sys.path was put there by Package Control. In order to see how that worked, we first need to point out that the order of items in the sys.path is important because it’s consulted in order. Thus if you’re going to add things to it, you need to be careful of what order you put them in.

This is where the idea of the load order that is mentioned in the Package Control dependency docs comes into play. Using View Package File, you can find and open sublime_lib/.sublime-dependency, which contains just the text 01, which is the load order for this dependency.

If you use View Package File from the command palette and enter the filter 0_package_control_loader/, you can see just the contents of that package, which is primarily a list of files with a numeric prefix. One of those files is 01-sublime_lib.py, and it contains some code that ultimately adds the location of sublime_lib to the sys.path.

Packages are loaded in lexical(ish) order, and plugins inside of packages are as well. So Package Control creates the 0_package_control_loader package with that name to make sure that it’s the first user-installed package loaded, and when it’s loaded the plugins are loaded in lexical (numeric) order, so the sys.path gets modified for each dependency based on the load order that they declare.
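The ordering trick is easy to see with a plain sort. The package list below is a hypothetical install, and the 50 prefix for pyyaml is invented for the example (only the 01 for sublime_lib comes from the files discussed above). A plain sorted() is also only an approximation of the lexical(ish) order Sublime uses.

```python
# Hypothetical set of installed packages and loader plugin files.
packages = ["PackageDev", "MyPackage", "0_package_control_loader"]
loader_files = ["50-pyyaml.py", "01-sublime_lib.py"]

# Digits sort before letters, so the loader package comes first.
package_order = sorted(packages)

# Within the loader package, the numeric prefix decides the order.
loader_order = sorted(loader_files)

print(package_order)
print(loader_order)
```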

#4

@OdatNurd Thank you so much for all of this information – It’ll take me a little time/attention to really read through and grok it all, but just from skimming it I can tell there’s some good stuff here… Here are a couple of things I’ve noticed right off the bat…

Thank you for this link – I swear I checked the ‘dependencies.json’ file for both “docx” & “lxml” and didn’t find either, but of course now that I look again I see “lxml” there just fine :stuck_out_tongue_closed_eyes:

Thank you for this, and all the stuff you said about making and submitting a dependency package – I feel like I’ve got a good starting point for moving forward now :+1:
