Sublime Forum

Retrieving inserted and deleted text

#1

I am experimenting with a plugin and I’d like to capture every edit made in the editor (inserts and deletes of single characters, and of multiple characters through pastes and cuts). I am able to grab individual inserts using on_modified and retrieving the view’s selected regions (with the help of @fico over on Stack Overflow, thanks!):

import sublime, sublime_plugin

class EventListener ( sublime_plugin.EventListener ):

    def on_modified ( self, view ):

        selectedRegions = view.sel()

        for region in selectedRegions:

            row, column = view.rowcol ( region.a )
            line = row + 1
            lastCharacter_Region = sublime.Region ( region.a - 1, region.a )
            lastCharacter = view.substr ( lastCharacter_Region )

            print ( "line: " + str ( line ) + "   col: " + str ( column ) + "   char: " + lastCharacter )

However, this only works for the last character inserted (region.a - 1, region.a). I’d like to be able to:

  • know when multiple characters are inserted from a paste
  • know when a single character is deleted
  • know when multiple characters are deleted (and where the group starts and ends)

The regions that are returned all have the same a and b values and the size is 0 so I’m not sure how to figure out how many new characters were added or deleted.

2 Likes

#2

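region.a & region.b will be the same when there is no selection ( caret position only ). If you have text selected, a is the anchor where the selection started & b is the caret end, so a can be greater than b when the selection was made right to left. region.begin() & region.end() return the lower & upper offsets regardless of direction.

To capture the clipboard contents, you can map your ctrl + v key-binding to something like this ( the command name for the key-binding would be log_paste ):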

import sublime
import sublime_plugin


class LogPasteCommand(sublime_plugin.TextCommand):

    def run(self, edit):
        view = self.view
        clipBoard = sublime.get_clipboard()

        # Work from the last region to the first so that replacing one
        # region does not shift the offsets of regions still to be handled.
        for region in reversed(list(view.sel())):
            # rowcol() returns zero-based values; rows are shown one-based.
            row_A, column_A = view.rowcol(region.begin())
            row_B, column_B = view.rowcol(region.end())
            lines = str(row_A + 1) + "-" + str(row_B + 1)
            columns = str(column_A) + "-" + str(column_B)

            view.replace(edit, region, clipBoard)

            print("line: " + lines + "   col: " + columns + "   clip: " + clipBoard)

        print(clipBoard)

 
Depending on what you want to see in the log, you could:

  • factor in the length of clipBoard to the printed lines & columns
  • run a check to see if row_A == row_B and/or column_A == column_B, and eliminate the duplicate value if they return True
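
The two suggestions above can be sketched in plain Python, independent of the Sublime API. The helper names end_position and format_span are made up for illustration; end_position assumes the zero-based row and column values that view.rowcol returns, with columns counted in characters.

```python
def end_position(row, col, text):
    """Return the zero-based (row, col) just past `text` inserted at (row, col)."""
    newlines = text.count("\n")
    if newlines == 0:
        return row, col + len(text)
    # After a multi-line insert, the column restarts on the last pasted line.
    return row + newlines, len(text) - text.rfind("\n") - 1


def format_span(start, end):
    """Format a start/end pair as "a-b", or just "a" when both are equal."""
    return str(start) if start == end else "{}-{}".format(start, end)
```

In LogPasteCommand, end_position could be fed the start row and column plus clipBoard to log where the pasted text ends, and format_span could replace the string concatenation for lines and columns.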

A similar approach could be implemented to replace the default delete, backspace, & replace commands.

Also, there is an issue you will have to create a workaround for.
The change made by LogPasteCommand still triggers on_modified, so each paste is printed to the log twice: once by LogPasteCommand and once by on_modified.

You’ll probably want to figure out a way to skip the on_modified print if one of the other commands has just been executed.

I recommend implementing something like:

  • global variable onModified_PrintEnabled
  • @ non-on_modified commands: set onModified_PrintEnabled = False
  • @ on_modified:
      • if onModified_PrintEnabled == False: set onModified_PrintEnabled = True & return
      • else: run the script
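
A minimal sketch of that flag logic, written as plain Python so it can be tried outside Sublime. The class and method names are hypothetical, and it assumes on_modified fires exactly once after each custom command, which is worth checking for multi-caret edits.

```python
class ModificationLog:
    """Collects log lines and can skip the next on_modified report."""

    def __init__(self):
        self.entries = []
        self.print_enabled = True  # plays the role of onModified_PrintEnabled

    def log_command(self, message):
        # Called by a custom command (paste, delete, ...) that logs its own
        # change; the on_modified call that follows should stay silent.
        self.entries.append(message)
        self.print_enabled = False

    def log_modified(self, message):
        # Called from on_modified. Returns True if the message was logged.
        if not self.print_enabled:
            self.print_enabled = True  # re-arm for the next ordinary keystroke
            return False
        self.entries.append(message)
        return True
```

Keeping the flag on an object rather than in a bare global makes it easier to hold one log per view later on.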
1 Like

#3

Note:

You’ll run into the same issue as paste, delete, & backspace if you use any plugins that insert text programmatically ( TextPastry, ASCII Decorator, etc. ).

I posted a few thoughts at another thread about a potential workaround, but have not yet had time to implement it.

1 Like

#4

Further to the amazing info that fico has provided, you can use the on_selection_modified event to detect when the selection changes and keep a record of it. Then, when the user types, you will know what the selection was at the time, and therefore which characters were replaced. Hope that makes sense.

2 Likes

#5

Out of curiosity, what is your intended application for this?

I ask because the data you are acquiring will build up pretty quickly and require a significant amount of parsing to be useful.

What benefit does this method offer you over diffing?

EG:

Diff

@ Compare Side-By-Side

1 Like

#6

My intent is to capture and store all of the individual changes and store them in a database. I would then use this data to recreate multiple programming sessions and allow the user to watch them and create a narrative about how the code has evolved. The user can add text based comments and draw pictures describing how the code works. The goal is to make it easy for developers to teach and learn from each other.

Here are a couple of videos of my prototype so far.

Basic playback tools:

A complete narrative:

For the prototype I am using the web based ace editor but to move forward I need to use a desktop editor. An older version of this tool uses an eclipse plugin but I wanted to build a plugin for a lightweight editor too.

I am looking for feedback so if you have any comments I would like to hear them!

4 Likes

#7

looks awesome! what a great idea! a similar concept to https://asciinema.org/ which is for Terminal session recording and playback :slightly_smiling:

1 Like

#8

Dude, that’s a pretty awesome idea. :+1:

I recommend adding a timestamp to the data that you’re capturing, so that you can group multi-caret events together and have them update in parallel.

This would be cool to see replicated in Sublime Text also. It would allow for fullscreen viewing, along with user preferences like theme, color scheme, syntax highlighting, etc. You could also make use of the show function to scroll along with the edits. Markdown popups could be used to show relevant notes within specified time periods.

Not sure if it could be pulled off, I haven’t worked with or seen any kind of timer functions in ST.

If it can’t be done in ST, maybe LightTable or Atom would be able to handle it. LT has some pretty crazy features, I wouldn’t be surprised.

3 Likes

#9

The sublime module has a set_timeout method, along with an async version ( set_timeout_async ).

2 Likes

#10

How does that work?

I tried:

import sublime, sublime_plugin

class TestCommand ( sublime_plugin.TextCommand ):

	def run ( self, edit ):

		selection = self.view.sel()[0].a

		for index in range ( 0,5 ):
			sublime.set_timeout( TestCommand.insertText ( edit, self.view, selection ), 1000 )

	def insertText ( edit, view, selection ):

		view.insert ( edit, selection, "TEXT " )

but it just prints TEXT TEXT TEXT TEXT TEXT in one shot, with no delay.

Also, I tried set_async_timeout & it threw:
AttributeError: 'module' object has no attribute 'set_async_timeout'

1 Like

#11

I do grab the timestamp for each keystroke and I give the user the option to see all of the events with the same timestamp in one big block (default) or the user can choose to play out each one individually if they’d like to slow things down. This should work with multi-caret editing.
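
Grouping events that share a timestamp could look roughly like this. The event layout (dicts with "timestamp", "offset" and "text" keys) is an assumption made for the sketch, not the actual storage format of the prototype.

```python
from itertools import groupby
from operator import itemgetter


def group_by_timestamp(events):
    """Return a list of event lists, one list per distinct timestamp.

    `events` is a list of dicts that each carry a "timestamp" key
    (a hypothetical layout). sorted() is stable, so events with the
    same timestamp keep their original relative order.
    """
    ordered = sorted(events, key=itemgetter("timestamp"))
    return [list(group)
            for _, group in groupby(ordered, key=itemgetter("timestamp"))]
```

Playback can then apply each inner list as one block by default, or step through its items one at a time when the user wants to slow things down.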

I did want the playback to happen in the browser because I’d like to make a github-like site where developers can post their ‘stories’ regardless of the editor they use (I am researching multiple editor plugins now). People would be able to go there to learn from others and, perhaps, demonstrate to potential employers how they think about problems.

Sublime is definitely one of the editors I’d like to tackle first but if it is not possible to grab all the required data I may move on to another one.

1 Like

#12

You definitely should be able to get the data you need. It’s more involved than the examples I posted, but you can get every change ( including programmatic ones ) by running a full file diff on each modification.
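
One way to run such a diff is Python’s standard difflib module. The sketch below is only an illustration: it assumes the plugin keeps the previous buffer contents as a string and reads the new contents on each on_modified call, and the function name text_changes is made up.

```python
import difflib


def text_changes(old_text, new_text):
    """Return (kind, offset, text) tuples describing old_text -> new_text.

    `kind` is "delete" or "insert"; `offset` is a position in old_text.
    A replacement is reported as a delete followed by an insert.
    """
    matcher = difflib.SequenceMatcher(None, old_text, new_text, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("delete", "replace"):
            changes.append(("delete", i1, old_text[i1:i2]))
        if tag in ("insert", "replace"):
            changes.append(("insert", i1, new_text[j1:j2]))
    return changes
```

Comparing whole buffers character by character gets slow on large files, so this is probably better suited to a per-line diff or to a fallback path than to every keystroke.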

1 Like

#13

I was hoping to avoid doing a full diff on each keystroke for performance reasons.

The Ace-based editor I use in the browser prototype, for example, has handlers that just show you the changes for each insert/delete. It is very simple and quick to get the changes. My Eclipse plugin also delivers just the changes (no diffing required). I was hoping Sublime would have a similar API.

1 Like

#14

a simplified version of your example to just print to the console:

import sublime, sublime_plugin
class TestCommand ( sublime_plugin.TextCommand ):
    def run ( self, edit ):
        for index in range ( 0,5 ):
            sublime.set_timeout_async(lambda: print ( ' text' ), 1000*index )
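
The lambda matters because set_timeout and set_timeout_async expect a callable as their first argument. Writing sublime.set_timeout( f ( x ), 1000 ) calls f ( x ) immediately and schedules its return value instead. The plain-Python sketch below shows the difference; fake_set_timeout is a made-up stand-in that only records what it was given, not the real Sublime function.

```python
pending = []  # (delay_ms, callback) pairs, in the order they were scheduled


def fake_set_timeout(callback, delay_ms):
    # Hypothetical stand-in for sublime.set_timeout: it stores the callback
    # instead of running it later on a timer.
    pending.append((delay_ms, callback))


output = []


def insert_text(text):
    output.append(text)


# Wrong: insert_text("A") runs right now, and its return value (None) is
# what gets scheduled.
fake_set_timeout(insert_text("A"), 1000)
output_after_wrong_call = list(output)

# Right: the lambda is scheduled and only calls insert_text when it is run.
fake_set_timeout(lambda: insert_text("B"), 1000)
output_after_right_call = list(output)

# Simulate the timers firing.
for delay_ms, callback in pending:
    if callable(callback):
        callback()
```

functools.partial ( insert_text, "B" ) would work in place of the lambda. Note also that the edit object passed to run is only valid while run is executing, so a delayed callback cannot reuse it; running a separate TextCommand from the callback gives it a fresh edit object.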
2 Likes

#15

As long as the assumption holds that any command only affects text next to a cursor, I don’t think a full diff would be necessary:

  1. store the cursor positions when the selection changes
  2. when the document is modified, get the cursor positions and compare them to what was stored
  3. the change lies between the old and new positions, whether it was a deletion or an addition

For example:

  1. cursor at line 1 column 2
  2. user runs a command to paste ‘hello\nworld’
  3. cursor is now at line 2 column 5; the inserted text is between these two positions

Obviously it becomes more complicated with multiple cursors, but you can take into account how far the cursors before the one you are analysing have moved.

Or am I missing something?
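
The steps above can be sketched for the insert case in plain Python, using character offsets (which is what the regions in view.sel() hold) rather than rows and columns. The function name is hypothetical, and the sketch assumes carets with no selection, listed in buffer order, and inserts only.

```python
def inserted_texts(buffer_text, old_cursors, new_cursors):
    """Return the text inserted at each caret.

    old_cursors: caret offsets stored before the modification.
    new_cursors: caret offsets read after the modification.
    buffer_text: the buffer contents after the modification.
    """
    results = []
    shift = 0  # characters added by carets earlier in the buffer
    for old, new in zip(old_cursors, new_cursors):
        start = old + shift  # where this caret sat, in new-buffer offsets
        results.append(buffer_text[start:new])
        shift += new - start
    return results
```

With a caret at offset 2 in "ab" and ‘hello\nworld’ pasted, the buffer becomes "abhello\nworld" and the caret moves to offset 13, so the slice from 2 to 13 is the pasted text.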

2 Likes

#16

That sounds like it would definitely work for inserts… what about deletes? Is it possible to know when we have an insert or a delete?

All I need is the positions of the deleted characters because I will be storing a shadow copy of the insert events but I don’t know how I would differentiate between when code is being added and when it is being deleted. Perhaps I can look at the size of the buffer to tell whether there was an insert or delete???

1 Like

#17

I would just go with: if the cursor’s begin position is before the previous end position, then there was a delete :slight_smile:

Of course, if there was a non-empty selection and it has been replaced with something else (which could be the same size, bigger or smaller), it would be a delete and an insert. But you can identify this by comparing your shadow copy to the buffer between the old cursor position and the new one to see what is different and where…

I might try to build a reusable proof of concept when I get a chance :wink:
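
One possible form of that shadow-copy comparison is to trim the common prefix and suffix of the two strings, which leaves the changed span without relying on cursor positions. This is a swapped-in technique for illustration, not the proof of concept mentioned above. It assumes each modification is a single contiguous edit, and the function name is made up.

```python
def changed_span(shadow, current):
    """Return (offset, deleted_text, inserted_text) for one contiguous edit.

    shadow:  the text before the modification (the shadow copy).
    current: the text after the modification.
    """
    limit = min(len(shadow), len(current))

    # Length of the unchanged text at the start.
    prefix = 0
    while prefix < limit and shadow[prefix] == current[prefix]:
        prefix += 1

    # Length of the unchanged text at the end, not overlapping the prefix.
    suffix = 0
    while (suffix < limit - prefix
           and shadow[len(shadow) - 1 - suffix] == current[len(current) - 1 - suffix]):
        suffix += 1

    deleted = shadow[prefix:len(shadow) - suffix]
    inserted = current[prefix:len(current) - suffix]
    return prefix, deleted, inserted
```

An empty deleted_text means a pure insert, an empty inserted_text means a pure delete, and both non-empty means a replacement, including one where the buffer size does not change. The scan is linear in the buffer size; restricting it to the slice between the old and new cursor positions would cut that down.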

1 Like

#18

What about this scenario:

  • insert some text on line 10 (of 100 total lines) cursor position is on line 10 and whatever column
  • user goes to line 90 and deletes some code

In this scenario the new cursor position is after the latest insert, which would imply an insert but it is really a delete. I am also worried about replacing 10 characters of code with a different 10 characters- the size of the buffer is the same but there were some changes.

A proof of concept would be great if you have the time!

1 Like

#19

Also check out Modific

They have a pretty well functioning model that works asynchronously.

1 Like

#20

Nice. Any reason why this works with lambdas but not regular functions? If I move self.view.run_command ( "timeout_insert" ) to its own function & call the function instead of a lambda, it just jumps to the final output without any delay.

When I changed it back to inserting into the view directly, the edit object expired after the first loop.

I tweaked it a bit & got this:

 

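import sublime
import sublime_plugin

# State shared by the two commands: the text to type out and the index of
# the next character to insert.
string = ""
stringIndex = 0


class TestCommand(sublime_plugin.TextCommand):

    def run(self, edit):
        global string, stringIndex

        string = "Testing delayed output."
        stringIndex = 0

        # Schedule one insert per character, 100 ms apart. The lambda defers
        # the call; each insert runs as its own command with a fresh edit.
        for index in range(len(string)):
            sublime.set_timeout_async(
                lambda: self.view.run_command("timeout_insert"), 100 * index)


class TimeoutInsertCommand(sublime_plugin.TextCommand):

    def run(self, edit):
        global string, stringIndex

        # Insert the next character at the first caret, then advance.
        self.view.insert(edit, self.view.sel()[0].a, string[stringIndex])
        stringIndex += 1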
1 Like