Motivation

Today, I want to dive into a topic that I have been wondering about for a long time:

How does LSP work?

I have some basic understanding of what LSP is:

LSP is a protocol designed to standardize the request and response format between a client (an editor like Neovim) and a server (a language server like gopls). This decouples editors from languages: an editor can switch to another language server with minimal changes, and the same language server can be used in any editor with LSP support.

This idea is brilliant. It reduces the complexity from MxN to M+N [1]:

[Figure: LSP from MxN to M+N]

But what is inside this middle layer? What is the standardized request and response format?

JSON-RPC

Looking at the overview on the official page, the answer is obvious: JSON-RPC is the format used for communication between clients and servers.

[Figure: Language server sequence]

For example, when we perform “Go to definition”, the editor sends a request to the language server with a JSON body:

{
    "jsonrpc": "2.0",
    "id" : 1,
    "method": "textDocument/definition",
    "params": {
        "textDocument": {
            "uri": "file:///p%3A/mseng/VSCode/Playgrounds/cpp/use.cpp"
        },
        "position": {
            "line": 3,
            "character": 12
        }
    }
}

The language server then responds with a JSON body:

{
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "uri": "file:///p%3A/mseng/VSCode/Playgrounds/cpp/provide.cpp",
        "range": {
            "start": {
                "line": 0,
                "character": 4
            },
            "end": {
                "line": 0,
                "character": 11
            }
        }
    }
}

This looks like an HTTP API request and response with JSON bodies. But JSON-RPC is not (necessarily) an HTTP API.

JSON-RPC = RPC + JSON

RPC

RPC stands for Remote Procedure Call. It allows us to call a procedure / function / method defined elsewhere.

For example, I have implemented the functions add() and full_name() in Python:

def add(a, b):
    return a + b

def full_name(**kwargs):
    first_name = kwargs["first_name"]
    last_name = kwargs["last_name"]
    return f"{first_name} {last_name}"

Normally, we call the function in the same Python program by typing the function name and arguments:

result = add(2, 3)
result = full_name(first_name="Haskell", last_name="Curry")

But we can also call the function remotely, that is, from outside the program that defines it, such as from another process or another machine.

RPC is a concept: calling a function remotely. It has no specification of its own, so there are many implementations that realize it; JSON-RPC and gRPC are two examples. Each implementation has its own specification and capabilities, but they all provide some mechanism to achieve RPC.
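Whatever the implementation, the core mechanism can be sketched in a few lines: the callee keeps a table that maps names to functions, and the caller sends a name plus arguments instead of calling the function directly. This is only a sketch. The names REGISTRY and call_by_name are made up for illustration, and the "remote" side lives in the same process to keep the example self-contained.

```python
def add(a, b):
    return a + b

def full_name(**kwargs):
    first_name = kwargs["first_name"]
    last_name = kwargs["last_name"]
    return f"{first_name} {last_name}"

# Hypothetical registry: the callee exposes its functions under string names.
REGISTRY = {"add": add, "full_name": full_name}

def call_by_name(name, args=(), kwargs=None):
    """Look up a function by its name and call it with the given arguments."""
    func = REGISTRY[name]
    return func(*args, **(kwargs or {}))

# The caller only needs the name and the arguments, not the function object.
print(call_by_name("add", args=(2, 3)))
print(call_by_name("full_name", kwargs={"first_name": "Haskell", "last_name": "Curry"}))
```

A real RPC implementation adds the two missing pieces around this lookup: an encoding for the name and arguments, and a way to move the encoded bytes to the callee.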

JSON

To call a function, we need to provide two essential inputs: the function name and its arguments.

The follow-up questions are:

  1. How to encode the input - data format concern
  2. How to send the input to the process that actually runs the function - transport concern

JSON-RPC only specifies the data format. It is up to the JSON-RPC client and server to agree on the transport.

It is transport agnostic in that the concepts can be used within the same process, over sockets, over http, or in many various message passing environments. It uses JSON (RFC 4627) as data format.

Look back at the “Go to definition” request. It makes more sense to me now:

{
    "jsonrpc": "2.0",
    "id" : 1,
    "method": "textDocument/definition",
    "params": {
        "textDocument": {
            "uri": "file:///p%3A/mseng/VSCode/Playgrounds/cpp/use.cpp"
        },
        "position": {
            "line": 3,
            "character": 12
        }
    }
}

This is an RPC request that uses JSON as the data format to encode the request. Because it is JSON-RPC, it is transport agnostic.

The meaning of each field:


method is the function name. The server should have registered a method named textDocument/definition so that we can call it remotely.


params holds the function arguments. textDocument and position are named arguments.

From the spec:

by-name: params MUST be an Object

If the arguments are positional (i.e. unnamed), it looks like "params": [...]

by-position: params MUST be an Array
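To make the two params shapes concrete, here is a minimal sketch of how a server might turn them into a Python call: an Object is unpacked as keyword arguments and an Array as positional arguments. METHODS and invoke are hypothetical names, and the sketch skips everything a real server needs (error handling, building the response).

```python
import json

def add(a, b):
    return a + b

def full_name(**kwargs):
    return f'{kwargs["first_name"]} {kwargs["last_name"]}'

# Hypothetical method table, standing in for the server's registered methods.
METHODS = {"add": add, "full_name": full_name}

def invoke(request_text):
    """Decode a JSON-RPC request and call the target method with its params."""
    request = json.loads(request_text)
    func = METHODS[request["method"]]
    params = request.get("params", [])
    if isinstance(params, dict):
        return func(**params)  # by-name: params is an Object
    return func(*params)       # by-position: params is an Array

print(invoke('{"jsonrpc": "2.0", "method": "add", "params": [2, 3], "id": 1}'))
print(invoke('{"jsonrpc": "2.0", "method": "full_name", '
             '"params": {"first_name": "Haskell", "last_name": "Curry"}, "id": 2}'))
```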


jsonrpc is the protocol version, which is pretty straightforward.


For id, from the spec:

Request Object

An identifier established by the Client that MUST contain a String, Number, or NULL value if included. If it is not included it is assumed to be a notification.

Response object

It MUST be the same as the value of the id member in the Request Object

So id is a field that lets the client match its requests with the server’s responses, because responses can arrive out of order.
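A small sketch of that matching, assuming a client that remembers each request id until the response arrives. The pending table and the helper names are invented for this example; the responses are fed in by hand, in reverse order, to imitate out-of-order delivery.

```python
import json

# Hypothetical client-side bookkeeping: request id -> method that was called.
pending = {}

def make_request(request_id, method, params):
    """Encode a request and remember its id until the response arrives."""
    pending[request_id] = method
    return json.dumps(
        {"jsonrpc": "2.0", "id": request_id, "method": method, "params": params})

def match_response(response_text):
    """Return (method, result) for the request that this response answers."""
    response = json.loads(response_text)
    method = pending.pop(response["id"])
    return method, response.get("result")

make_request(1, "add", [2, 3])
make_request(2, "full_name", {"first_name": "Haskell", "last_name": "Curry"})

# The responses arrive in the opposite order; the id still pairs them correctly.
print(match_response('{"jsonrpc": "2.0", "id": 2, "result": "Haskell Curry"}'))
print(match_response('{"jsonrpc": "2.0", "id": 1, "result": 5}'))
```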

There is an interesting word above: notification.

Notification

If id is absent, the JSON represents a notification. Looking back at the LSP overview diagram, there are two types of JSON-RPC message: request and notification.

[Figure: Language server sequence]

After sending a notification, the client does not receive any response or acknowledgement. It just notifies the server, as the name suggests.
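The difference can be sketched from the server's side: the handler runs in both cases, but a response is produced only when the incoming message carries an id. The function names and the "log" and "echo" methods are hypothetical, chosen only to keep the example short.

```python
import json

def make_notification(method, params):
    """A notification is a request object without an "id" member."""
    return json.dumps({"jsonrpc": "2.0", "method": method, "params": params})

def serve(message_text, handlers):
    """Run the handler; return a response string for requests, None for notifications."""
    message = json.loads(message_text)
    result = handlers[message["method"]](message.get("params"))
    if "id" not in message:
        return None  # notification: nothing is sent back
    return json.dumps({"jsonrpc": "2.0", "id": message["id"], "result": result})

received = []
handlers = {
    "log": lambda params: received.append(params),  # hypothetical method
    "echo": lambda params: params,                  # hypothetical method
}

print(serve(make_notification("log", ["file saved"]), handlers))
print(serve('{"jsonrpc": "2.0", "id": 1, "method": "echo", "params": [1]}', handlers))
```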

Acting as client and server at the same time

Another notable thing in the diagram is that the language server can also send notifications to the editor. This means that both the editor and the language server play a dual role: each one acts as a JSON-RPC client and as a JSON-RPC server.

Hint: In the LSP specification, you can search for the text "from the client to the server" and "from the server to the client" to find the LSP requests and notifications sent in each direction.

The JSON-RPC spec mentions the possibility of a dual role:

The Client is defined as the origin of Request objects and the handler of Response objects. The Server is defined as the origin of Response objects and the handler of Request objects.

One implementation of this specification could easily fill both of those roles, even at the same time, to other different clients or the same client. This specification does not address that layer of complexity.
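In practice, a peer that fills both roles has to sort each incoming message before handling it: requests and notifications go to its server half, responses go back to its client half. A minimal sketch of that sorting, based only on which members are present, follows; classify is a hypothetical helper, not something defined by either spec.

```python
import json

def classify(message_text):
    """Sort an incoming JSON-RPC message by the members it contains."""
    message = json.loads(message_text)
    if "method" in message:
        # Someone is calling us: we act as the server for this message.
        return "request" if "id" in message else "notification"
    if "result" in message or "error" in message:
        # Someone is answering us: we act as the client for this message.
        return "response"
    return "invalid"

# An editor receiving these two messages is a client for the first
# and a server for the second.
print(classify('{"jsonrpc": "2.0", "id": 1, "result": null}'))
print(classify('{"jsonrpc": "2.0", "method": "textDocument/publishDiagnostics", '
               '"params": {"uri": "file:///tmp/a.go", "diagnostics": []}}'))
```

Here textDocument/publishDiagnostics is an LSP notification sent from the server to the client; the URI is a placeholder.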

JSON-RPC over HTTP

Now I have a sufficient understanding of JSON-RPC, which is the data format used by the LSP client and server for communication. Before diving into LSP, I want to play with JSON-RPC to learn how to make a JSON-RPC server.

Using the Python library json-rpc to make a JSON-RPC server is easy.

from werkzeug.wrappers import Request, Response
from werkzeug.serving import run_simple

from jsonrpc import JSONRPCResponseManager
from jsonrpc.dispatcher import Dispatcher

def add(a, b):
    return a + b

def full_name(**kwargs):
    first_name = kwargs["first_name"]
    last_name = kwargs["last_name"]
    return f"{first_name} {last_name}"

@Request.application
def application(request):
    dispatcher = Dispatcher({
        "add": add,
        "full_name": full_name,
    })
    response = JSONRPCResponseManager.handle(
        request.data, dispatcher)
    return Response(response.json, mimetype='application/json')

if __name__ == '__main__':
    print("==== Local call ====")
    print(add(2, 3))
    print(full_name(first_name="Haskell", last_name="Curry"))

    print("==== Remote call via HTTP: POST http://localhost:4000/jsonrpc ====")
    run_simple('localhost', 4000, application)

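There are two main components in the program: an HTTP server and an RPC handler.

The HTTP server is responsible for receiving the HTTP request, passing its body to the RPC handler, and returning the handler's output as the HTTP response.

The RPC handler is responsible for parsing the JSON-RPC request, dispatching it to the registered function, and encoding the result as a JSON-RPC response.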


RPC request for add:

curl --request POST \
  --url http://localhost:4000/jsonrpc \
  --header 'content-type: application/json' \
  --data '{"jsonrpc": "2.0", "method": "add", "params": [2, 3], "id": 1}'

Output:

{"result": 5, "id": 1, "jsonrpc": "2.0"}

RPC request for full_name:

curl --request POST \
  --url http://localhost:4000/jsonrpc \
  --header 'content-type: application/json' \
  --data '{"jsonrpc": "2.0", "method": "full_name", "params": {"first_name": "Haskell", "last_name": "Curry"}, "id": 2}'

Output:

{"result": "Haskell Curry", "id": 2, "jsonrpc": "2.0"}
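The same requests can be sent from Python instead of curl. The following is a sketch using only the standard library; build_request and jsonrpc_call are hypothetical helper names, and the URL in the usage note assumes the server above is running on localhost:4000.

```python
import json
import urllib.request

def build_request(method, params, request_id):
    """Encode a JSON-RPC 2.0 request as UTF-8 bytes, ready to be an HTTP body."""
    return json.dumps(
        {"jsonrpc": "2.0", "method": method, "params": params, "id": request_id}
    ).encode("utf-8")

def jsonrpc_call(url, method, params, request_id):
    """POST one JSON-RPC request over HTTP and return the decoded response."""
    http_request = urllib.request.Request(
        url,
        data=build_request(method, params, request_id),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(http_request) as http_response:
        return json.loads(http_response.read())
```

With the server running, jsonrpc_call("http://localhost:4000/jsonrpc", "add", [2, 3], 1) should return the same result, id and jsonrpc members as the curl output above.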

JSON-RPC over stdio

JSON-RPC is transport agnostic, so the HTTP layer is not mandatory. Also, since LSP servers usually run on the same computer as the editor, it is better to use an efficient transport that is available locally.

The LSP spec recommends that language servers support stdio as one of the communication channels:

Servers usually support different communication channels (e.g. stdio, pipes, …)…

stdio: uses stdio as the communication channel. …

gopls also uses stdio by default.

To make a JSON-RPC server using stdio, a few minor changes are needed:

import sys

from jsonrpc import JSONRPCResponseManager
from jsonrpc.dispatcher import Dispatcher

def add(a, b):
    return a + b

def full_name(**kwargs):
    first_name = kwargs["first_name"]
    last_name = kwargs["last_name"]
    return f"{first_name} {last_name}"

def run_stdio():
    dispatcher = Dispatcher({
        "add": add,
        "full_name": full_name,
    })

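    for line in sys.stdin:
        if not line.strip():
            continue  # ignore blank lines
        response = JSONRPCResponseManager.handle(
            line, dispatcher)
        # handle() returns None for notifications, which get no response
        if response is not None:
            print(response.json, flush=True)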

if __name__ == '__main__':
    print("==== Local call ====")
    print(add(2, 3))
    print(full_name(first_name="Haskell", last_name="Curry"))

    print("==== Remote call via stdio: read request from stdin, output response to stdout ====")
    run_stdio()

Syntax:

--> data sent to Server via stdin
<-- data sent to Client via stdout

RPC call over stdio:

--> {"jsonrpc": "2.0", "method": "add", "params": [2, 3], "id": 1}
<-- {"result": 5, "id": 1, "jsonrpc": "2.0"}
--> {"jsonrpc": "2.0", "method": "full_name", "params": {"first_name": "Haskell", "last_name": "Curry"}, "id": 2}
<-- {"result": "Haskell Curry", "id": 2, "jsonrpc": "2.0"}
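The exchange above can also be driven from another program rather than typed by hand. Below is a sketch of such a client using subprocess. It assumes the server is saved as server.py (a made-up file name) and is started with python -u so that its output is not buffered when stdout is a pipe. Lines that are not JSON objects, such as the banner the server prints at startup, are skipped.

```python
import json
import subprocess
import sys

# Hypothetical command line: server.py is an assumed file name for the server above.
SERVER_COMMAND = [sys.executable, "-u", "server.py"]

def start_server(command):
    """Start a stdio JSON-RPC server as a child process with piped stdin/stdout."""
    return subprocess.Popen(
        command, stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True)

def call_over_stdio(proc, method, params, request_id):
    """Send one request line and return the first JSON object line that comes back."""
    request = {"jsonrpc": "2.0", "method": method, "params": params, "id": request_id}
    proc.stdin.write(json.dumps(request) + "\n")
    proc.stdin.flush()
    while True:
        line = proc.stdout.readline()
        if not line:
            raise EOFError("server closed stdout before responding")
        if line.lstrip().startswith("{"):  # skip banner and other non-JSON lines
            return json.loads(line)

# Usage:
#   proc = start_server(SERVER_COMMAND)
#   print(call_over_stdio(proc, "add", [2, 3], 1))
#   proc.stdin.close()
#   proc.wait()
```

Reading exactly one response per request keeps the sketch simple; it would hang on a notification, since the server sends nothing back for those.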

Ready for more LSP

JSON-RPC is a simple data format. It can be used over any transport. For LSP, stdio is a common choice.
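One caveat before pointing a client at a real language server: the toy stdio server above reads one JSON document per line, whereas the LSP base protocol puts a header part in front of each JSON-RPC message, with a Content-Length header giving the size of the body in bytes, followed by a blank line. A minimal sketch of that framing, with hypothetical helper names frame and unframe:

```python
import json

def frame(message):
    """Prefix a JSON-RPC message with the Content-Length header used by LSP."""
    body = json.dumps(message).encode("utf-8")
    header = f"Content-Length: {len(body)}\r\n\r\n".encode("ascii")
    return header + body

def unframe(data):
    """Split one framed message off the front of data; return (message, rest)."""
    header, _, rest = data.partition(b"\r\n\r\n")
    length = None
    for line in header.split(b"\r\n"):
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            length = int(value.strip())
    if length is None:
        raise ValueError("missing Content-Length header")
    return json.loads(rest[:length]), rest[length:]

request = {"jsonrpc": "2.0", "id": 1, "method": "add", "params": [2, 3]}
print(frame(request))
```

The length counts bytes rather than characters, which is why the body is encoded before it is measured.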

Now, I should be able to imitate a JSON-RPC client and make a request to an LSP server like gopls. For the next step, I will use JSON-RPC to probe into gopls in order to learn more about LSP. But let me take a break first…


  1. M editors and N languages