
Appendix H: PuTTY authentication plugin protocol

This appendix contains the specification for the protocol spoken over local IPC between PuTTY and an authentication helper plugin.

If you already have an authentication plugin and want to configure PuTTY to use it, see section 4.22.3 for how to do that. This appendix is for people writing new authentication plugins.

H.1 Requirements

The following requirements informed the specification of this protocol.

Automate keyboard-interactive authentication. We're motivated in the first place by the observation that the general SSH userauth method ‘keyboard-interactive’ (defined in [RFC4256]) can be used for many kinds of challenge/response or one-time-password styles of authentication, and in more than one of those, the necessary responses might be obtained from an auxiliary network connection, such as an HTTPS transaction. So it's useful if a user doesn't have to manually copy-type or copy-paste from their web browser into their SSH client, but instead, the process can be automated.

Be able to pass prompts on to the user. On the other hand, some userauth methods can be only partially automated; some of the server's prompts might still require human input. Also, the plugin automating the authentication might need to ask its own questions that are not provided by the SSH server. (For example, ‘please enter the master key that the real response will be generated by hashing’.) So after the plugin intercepts the server's questions, it needs to be able to ask its own questions of the user, which may or may not be the same questions sent by the server.

Allow automatic generation of the username. Sometimes, the authentication method comes with a mechanism for discovering the username to be used in the SSH login. So the plugin has to start up early enough that the client hasn't committed to a username yet.

Future expansion route to other SSH userauth flavours. The initial motivation for this protocol is specific to keyboard-interactive. But other SSH authentication methods exist, and they may also benefit from automation in future. We're making no attempt here to predict what those methods might be or how they might be automated, but we do need to leave a space where they can be slotted in later if necessary.

Minimal information loss. Keyboard-interactive prompts and replies should be passed to and from the plugin in a form as close as possible to the way they look on the wire in SSH itself. Therefore, the protocol resembles SSH in its data formats and marshalling (instead of, for example, translating from SSH binary packet style to another well-known format such as JSON, which would introduce edge cases in character encoding).

Half-duplex. Simultaneously trying to read one I/O stream and write another adds a lot of complexity to software. It becomes necessary to have an organised event loop containing select or WaitForMultipleObjects or similar, which can invoke the handler for whichever event happens soonest. There's no need to add that complexity in an application like this, which isn't transferring large amounts of bulk data or multiplexing unrelated activities. So, to keep life simple for plugin authors, we set the ground rule that it must always be 100% clear which side is supposed to be sending a message next. That way, the plugin can be written as sequential code progressing through the protocol, making simple read and write calls to receive or send each message.

Communicate success/failure, to facilitate caching in the plugin. A plugin might want to cache recently used data for next time, but only in the case where authentication using that data was actually successful. So the client has to tell the plugin what the outcome was, if it's known. (But this is best-effort only. Obviously the plugin cannot depend on hearing the answer, because any IPC protocol at all carries the risk that the other end might crash or be killed by things outside its control.)

H.2 Transport and configuration

Plugins are executable programs on the client platform.

The SSH client must be manually configured to use a plugin for a particular connection. The configuration takes the form of a command line, including the location of the plugin executable, and optionally command-line arguments that are meaningful to the particular plugin.

The client invokes the plugin as a subprocess, passing it a pair of 8-bit-clean pipes as its standard input and output. On those pipes, the client and plugin will communicate via the protocol specified below.
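As an illustration of what ‘8-bit-clean’ implies for a plugin written in C, the following sketch prepares the standard streams before any message is exchanged. The function name plugin_prepare_stdio is hypothetical. The Windows branch assumes the Microsoft C runtime, which opens standard streams in text mode by default; disabling buffering on standard output is just one assumed way of making sure each message reaches the client promptly (calling fflush after every message would do equally well).

```c
#include <stdio.h>
#ifdef _WIN32
#include <fcntl.h>
#include <io.h>
#endif

/*
 * Hypothetical helper: make stdin/stdout safe for binary messages.
 * Returns 0 on success, -1 on failure. Call it before any other I/O.
 */
int plugin_prepare_stdio(void)
{
#ifdef _WIN32
    /* Text mode would translate newline bytes and treat 0x1A as EOF. */
    if (_setmode(_fileno(stdin), _O_BINARY) == -1)
        return -1;
    if (_setmode(_fileno(stdout), _O_BINARY) == -1)
        return -1;
#endif
    /* Unbuffered output, so a written message is not held back. */
    if (setvbuf(stdout, NULL, _IONBF, 0) != 0)
        return -1;
    return 0;
}
```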

H.3 Data formats and marshalling

This protocol borrows the low-level data formatting from SSH itself, in particular the following wire encodings from [RFC4251] section 5:

byte
An integer between 0 and 0xFF inclusive, transmitted as a single byte of binary data.
boolean
The values ‘true’ or ‘false’, transmitted as the bytes 1 and 0 respectively.
uint32
An integer between 0 and 0xFFFFFFFF inclusive, transmitted as 4 bytes of binary data, in big-endian (‘network’) byte order.
string
A sequence of bytes, preceded by a uint32 giving the number of bytes in the sequence. The length field does not include itself. For example, the empty string is represented by four zero bytes (the uint32 encoding of 0); the string "AB" is represented by the six bytes 0,0,0,2,'A','B'.
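
The encodings above are simple enough to implement by hand. The following is a minimal sketch in C; the function names put_uint32, get_uint32 and put_string are hypothetical, and the caller is assumed to supply a buffer that is large enough.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Write a uint32 in big-endian order. Returns the bytes written (4). */
size_t put_uint32(unsigned char *out, uint32_t v)
{
    out[0] = (unsigned char)(v >> 24);
    out[1] = (unsigned char)(v >> 16);
    out[2] = (unsigned char)(v >> 8);
    out[3] = (unsigned char)v;
    return 4;
}

/* Read a big-endian uint32 from exactly 4 bytes. */
uint32_t get_uint32(const unsigned char *in)
{
    return ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8) | (uint32_t)in[3];
}

/*
 * Write a string: a uint32 length followed by the bytes themselves.
 * 'out' must have room for 4 + len bytes. Returns the bytes written.
 */
size_t put_string(unsigned char *out, const void *data, uint32_t len)
{
    size_t n = put_uint32(out, len);
    if (len > 0)
        memcpy(out + n, data, len);
    return n + len;
}
```

With these helpers, put_string(buf, "AB", 2) writes the six bytes given in the example above.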

Unlike SSH itself, the protocol spoken between the client and the plugin is unencrypted, because local inter-process pipes are assumed to be secured by the OS kernel. So the binary packet protocol is much simpler than SSH proper, and is similar to SFTP and the OpenSSH agent protocol.

The data sent in each direction of the conversation consists of a sequence of messages exchanged between the SSH client and the plugin. Each message is encoded as a string. The contents of the string begin with a byte giving the message type, which determines the format of the rest of the message.
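
The framing described above might be sketched as follows. The function name plugin_frame_message is hypothetical, and the caller is assumed to provide an output buffer with room for the payload plus five bytes of header.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Frame one message into 'out', which must have room for
 * 5 + payload_len bytes. Returns the total number of bytes written,
 * or 0 if the payload is too long for the length field.
 */
size_t plugin_frame_message(unsigned char *out, unsigned char type,
                            const void *payload, uint32_t payload_len)
{
    uint32_t body_len;

    if (payload_len > UINT32_MAX - 1)
        return 0;
    body_len = payload_len + 1;        /* type byte plus payload */

    /* The enclosing string's length, big-endian. */
    out[0] = (unsigned char)(body_len >> 24);
    out[1] = (unsigned char)(body_len >> 16);
    out[2] = (unsigned char)(body_len >> 8);
    out[3] = (unsigned char)body_len;

    out[4] = type;                     /* first byte of the contents */
    if (payload_len > 0)
        memcpy(out + 5, payload, payload_len);
    return (size_t)body_len + 4;
}
```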

H.4 Protocol versioning

This protocol itself is versioned. At connection setup, the client states the highest version number it knows how to speak, and then the plugin responds by choosing the version number that will actually be spoken (which may not be higher than the client's value).

Including a version number makes it possible to make breaking changes to the protocol later.

Even version numbers represent released versions of this spec. Odd numbers represent drafts or development versions in between releases. A client and plugin negotiating an odd version number are not guaranteed to interoperate; the developer testing the combination is responsible for ensuring the two are compatible.

This document describes version 2 of the protocol, the first released version. (The initial drafts had version 1.)
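
One possible way for a plugin to pick its reply is sketched below. The names PLUGIN_MAX_VERSION and plugin_choose_version are hypothetical. The sketch also makes a policy assumption that the specification does not require: it never selects a draft (odd) version, stepping down to the nearest released version instead, and it returns 0 when no released version is available, in which case the plugin would presumably report failure.

```c
#include <stdint.h>

/* Assumption: the highest protocol version this plugin implements. */
#define PLUGIN_MAX_VERSION 2u

/*
 * Given the highest version the client offers, return the version the
 * plugin will speak, or 0 if there is no released version in common.
 * The result is never higher than client_max.
 */
uint32_t plugin_choose_version(uint32_t client_max)
{
    uint32_t v = client_max < PLUGIN_MAX_VERSION ? client_max
                                                 : PLUGIN_MAX_VERSION;

    if (v % 2 != 0)
        v--;                  /* odd means draft: step down to a release */
    return v >= 2 ? v : 0;    /* version 2 is the first released version */
}
```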

H.5 Overview and sequence of events

At the very beginning of the user authentication phase of SSH, the client launches the plugin subprocess, if one is configured. It immediately sends the PLUGIN_INIT message, giving the plugin some initial information about the destination of the SSH connection.

The plugin responds with PLUGIN_INIT_RESPONSE, which may optionally tell the SSH client what username to use.

The client begins trying to authenticate with the SSH server in the usual way, using the username provided by the plugin (if any) or alternatively one obtained via its normal (non-plugin) policy.

The client follows its normal policy for selecting authentication methods to attempt. If it chooses a method that this protocol does not cover, then the client will perform that method in its own way without consulting the plugin.

However, if the client and server decide to attempt a method that this protocol does cover, then the client sends PLUGIN_PROTOCOL specifying the SSH protocol id for the authentication method being used. The plugin responds with PLUGIN_PROTOCOL_ACCEPT if it's willing to assist with this auth method, or PLUGIN_PROTOCOL_REJECT if it isn't.

If the plugin sends PLUGIN_PROTOCOL_REJECT, then the client will proceed as if the plugin were not present. Later, if another auth method is negotiated (either because this one failed, or because it succeeded but the server wants multiple auth methods), the client may send a further PLUGIN_PROTOCOL and try again.

If the plugin sends PLUGIN_PROTOCOL_ACCEPT, then a protocol segment begins that is specific to that auth method, terminating in either PLUGIN_AUTH_SUCCESS or PLUGIN_AUTH_FAILURE. After that, again, the client may send a further PLUGIN_PROTOCOL.
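
On the plugin side, the accept-or-reject decision might look like the sketch below. The function name plugin_answer_protocol is hypothetical; it assumes the plugin has already extracted the bytes of the SSH protocol id from the PLUGIN_PROTOCOL message, and that this plugin only knows how to help with keyboard-interactive.

```c
#include <stdint.h>
#include <string.h>

#define PLUGIN_PROTOCOL_ACCEPT 4
#define PLUGIN_PROTOCOL_REJECT 5

/*
 * Decide how to answer PLUGIN_PROTOCOL. 'method' holds the SSH
 * protocol id as raw bytes (not NUL-terminated), 'len' its length.
 * Returns the type code of the reply to send.
 */
int plugin_answer_protocol(const unsigned char *method, uint32_t len)
{
    static const char ki[] = "keyboard-interactive";

    if (len == sizeof(ki) - 1 && memcmp(method, ki, len) == 0)
        return PLUGIN_PROTOCOL_ACCEPT;
    return PLUGIN_PROTOCOL_REJECT;
}
```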

Currently the only supported method is ‘keyboard-interactive’, defined in [RFC4256]. Once the client has announced this to the server, the followup protocol is as follows:

Each time the server sends an SSH_MSG_USERAUTH_INFO_REQUEST message requesting authentication responses from the user, the SSH client translates the message into PLUGIN_KI_SERVER_REQUEST and passes it on to the plugin.

At this point, the plugin may optionally send back PLUGIN_KI_USER_REQUEST containing prompts to be presented to the actual user. The client will reply with a matching PLUGIN_KI_USER_RESPONSE after asking the user to reply to the question(s) in the request message. The plugin can repeat this cycle multiple times.

Once the plugin has all the information it needs to respond to the server's authentication prompts, it sends PLUGIN_KI_SERVER_RESPONSE back to the client, which translates it into SSH_MSG_USERAUTH_INFO_RESPONSE to send on to the server.

After that, as described in [RFC4256], the server is free to accept authentication, reject it, or send another SSH_MSG_USERAUTH_INFO_REQUEST. Each SSH_MSG_USERAUTH_INFO_REQUEST is dealt with in the same way as above.

If the server terminates keyboard-interactive authentication with SSH_MSG_USERAUTH_SUCCESS or SSH_MSG_USERAUTH_FAILURE, the client informs the plugin by sending either PLUGIN_AUTH_SUCCESS or PLUGIN_AUTH_FAILURE. PLUGIN_AUTH_SUCCESS is sent when that particular authentication method was successful, regardless of whether the SSH server chooses to request further authentication afterwards: in particular, SSH_MSG_USERAUTH_FAILURE with the ‘partial success’ flag (see [RFC4252] section 5.1) translates into PLUGIN_AUTH_SUCCESS.
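
The translation rule above, as seen from the client side, can be sketched as a small function. The name plugin_outcome_message is hypothetical; the SSH message numbers are those assigned in [RFC4252].

```c
#define SSH_MSG_USERAUTH_FAILURE 51
#define SSH_MSG_USERAUTH_SUCCESS 52

#define PLUGIN_AUTH_SUCCESS 6
#define PLUGIN_AUTH_FAILURE 7

/*
 * Map the server's terminating message to the message the client
 * sends to the plugin. 'partial_success' is the flag carried in
 * SSH_MSG_USERAUTH_FAILURE and is ignored for other messages.
 * Returns -1 for a message that does not terminate the method.
 */
int plugin_outcome_message(int ssh_msg, int partial_success)
{
    if (ssh_msg == SSH_MSG_USERAUTH_SUCCESS)
        return PLUGIN_AUTH_SUCCESS;
    if (ssh_msg == SSH_MSG_USERAUTH_FAILURE)
        return partial_success ? PLUGIN_AUTH_SUCCESS : PLUGIN_AUTH_FAILURE;
    return -1;
}
```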

The plugin's standard input will close when the client no longer requires the plugin's services, for any reason. This could be because authentication is complete (with overall success or overall failure), or because the user has manually aborted the session in mid-authentication, or because the client crashed.
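
The sequence above, together with the ‘What happens next’ entries in section H.6, amounts to a small state machine in which the previous message determines which messages may follow. The sketch below encodes that table for the keyboard-interactive case. The function name plugin_transition_ok is hypothetical, and using 0 to mean ‘no message sent yet’ is a convention of this sketch rather than part of the protocol.

```c
#define PLUGIN_INIT                1
#define PLUGIN_INIT_RESPONSE       2
#define PLUGIN_PROTOCOL            3
#define PLUGIN_PROTOCOL_ACCEPT     4
#define PLUGIN_PROTOCOL_REJECT     5
#define PLUGIN_AUTH_SUCCESS        6
#define PLUGIN_AUTH_FAILURE        7
#define PLUGIN_INIT_FAILURE        8
#define PLUGIN_KI_SERVER_REQUEST  20
#define PLUGIN_KI_SERVER_RESPONSE 21
#define PLUGIN_KI_USER_REQUEST    22
#define PLUGIN_KI_USER_RESPONSE   23

/* Return 1 if 'next' may follow 'prev' (0 = nothing sent yet), else 0. */
int plugin_transition_ok(int prev, int next)
{
    switch (prev) {
      case 0:
        return next == PLUGIN_INIT;
      case PLUGIN_INIT:
        return next == PLUGIN_INIT_RESPONSE || next == PLUGIN_INIT_FAILURE;
      case PLUGIN_INIT_RESPONSE:
      case PLUGIN_PROTOCOL_REJECT:
      case PLUGIN_AUTH_SUCCESS:
      case PLUGIN_AUTH_FAILURE:
        return next == PLUGIN_PROTOCOL;
      case PLUGIN_PROTOCOL:
        return next == PLUGIN_PROTOCOL_ACCEPT ||
               next == PLUGIN_PROTOCOL_REJECT;
      case PLUGIN_PROTOCOL_ACCEPT:
      case PLUGIN_KI_SERVER_RESPONSE:
        return next == PLUGIN_KI_SERVER_REQUEST ||
               next == PLUGIN_AUTH_SUCCESS || next == PLUGIN_AUTH_FAILURE;
      case PLUGIN_KI_SERVER_REQUEST:
      case PLUGIN_KI_USER_RESPONSE:
        return next == PLUGIN_KI_USER_REQUEST ||
               next == PLUGIN_KI_SERVER_RESPONSE;
      case PLUGIN_KI_USER_REQUEST:
        return next == PLUGIN_KI_USER_RESPONSE;
      default:
        return 0;   /* PLUGIN_INIT_FAILURE ends the session */
    }
}
```

A check of this kind is optional, but it gives either side a cheap way to detect a peer that has lost track of whose turn it is.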

H.6 Message formats

This section describes the format of every message in the protocol.

As described in section H.3, every message starts with the same two fields:

uint32: overall length of the message
byte: message type code

The length field does not include itself, but does include the type code.
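
Because the protocol is half-duplex, a plugin can receive each message with plain blocking reads. The sketch below reads one message from a stdio stream; the function name plugin_read_message and the fixed-capacity caller-supplied buffer are assumptions of this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Read one message from 'in'. On success, stores the type code in
 * *type and the number of bytes after the type code in *payload_len,
 * copies those bytes into 'buf', and returns 0. Returns -1 on EOF,
 * read error, a length of zero (no room for a type code), or a
 * payload larger than 'cap'.
 */
int plugin_read_message(FILE *in, unsigned char *type,
                        unsigned char *buf, size_t cap, size_t *payload_len)
{
    unsigned char hdr[5];
    uint32_t len;

    if (fread(hdr, 1, 4, in) != 4)
        return -1;
    len = ((uint32_t)hdr[0] << 24) | ((uint32_t)hdr[1] << 16) |
          ((uint32_t)hdr[2] << 8) | (uint32_t)hdr[3];

    /* The length counts the type code, so it must be at least 1. */
    if (len < 1 || len - 1 > cap)
        return -1;
    if (fread(&hdr[4], 1, 1, in) != 1)
        return -1;
    if (len > 1 && fread(buf, 1, len - 1, in) != len - 1)
        return -1;

    *type = hdr[4];
    *payload_len = len - 1;
    return 0;
}
```

A plugin would typically call this with stdin; when the client closes the pipe, the first fread comes up short and the function returns -1.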

The following subsections each give the format of the remainder of the message, after the type code.

The type codes themselves are defined here:

#define PLUGIN_INIT                   1
#define PLUGIN_INIT_RESPONSE          2
#define PLUGIN_PROTOCOL               3
#define PLUGIN_PROTOCOL_ACCEPT        4
#define PLUGIN_PROTOCOL_REJECT        5
#define PLUGIN_AUTH_SUCCESS           6
#define PLUGIN_AUTH_FAILURE           7
#define PLUGIN_INIT_FAILURE           8

#define PLUGIN_KI_SERVER_REQUEST     20
#define PLUGIN_KI_SERVER_RESPONSE    21
#define PLUGIN_KI_USER_REQUEST       22
#define PLUGIN_KI_USER_RESPONSE      23

If this protocol is extended to be able to assist with further auth methods, their message type codes will also begin from 20, overlapping the codes for keyboard-interactive.
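
For sanity-checking received messages, the type codes above can be paired with the ‘Direction’ given for each message in the subsections that follow. The sketch below does this with a hypothetical function plugin_msg_sender; the codes from 20 upwards are interpreted as keyboard-interactive, which is an assumption that would stop holding if another auth method were added.

```c
#define PLUGIN_INIT                1
#define PLUGIN_INIT_RESPONSE       2
#define PLUGIN_PROTOCOL            3
#define PLUGIN_PROTOCOL_ACCEPT     4
#define PLUGIN_PROTOCOL_REJECT     5
#define PLUGIN_AUTH_SUCCESS        6
#define PLUGIN_AUTH_FAILURE        7
#define PLUGIN_INIT_FAILURE        8
#define PLUGIN_KI_SERVER_REQUEST  20
#define PLUGIN_KI_SERVER_RESPONSE 21
#define PLUGIN_KI_USER_REQUEST    22
#define PLUGIN_KI_USER_RESPONSE   23

/*
 * Return 'C' if the message is sent by the client, 'P' if it is sent
 * by the plugin, or 0 if the type code is not defined.
 */
char plugin_msg_sender(int type)
{
    switch (type) {
      case PLUGIN_INIT:
      case PLUGIN_PROTOCOL:
      case PLUGIN_AUTH_SUCCESS:
      case PLUGIN_AUTH_FAILURE:
      case PLUGIN_KI_SERVER_REQUEST:
      case PLUGIN_KI_USER_RESPONSE:
        return 'C';
      case PLUGIN_INIT_RESPONSE:
      case PLUGIN_INIT_FAILURE:
      case PLUGIN_PROTOCOL_ACCEPT:
      case PLUGIN_PROTOCOL_REJECT:
      case PLUGIN_KI_SERVER_RESPONSE:
      case PLUGIN_KI_USER_REQUEST:
        return 'P';
      default:
        return 0;
    }
}
```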

H.6.1 PLUGIN_INIT

Direction: client to plugin

When: the first message sent at connection startup

What happens next: the plugin will send PLUGIN_INIT_RESPONSE or PLUGIN_INIT_FAILURE

Message contents after the type code:

H.6.2 PLUGIN_INIT_RESPONSE

Direction: plugin to client

When: response to PLUGIN_INIT

What happens next: the client will send PLUGIN_PROTOCOL, or perhaps terminate the session (if no auth method is ever negotiated that the plugin can help with)

Message contents after the type code:

H.6.3 PLUGIN_INIT_FAILURE

Direction: plugin to client

When: response to PLUGIN_INIT

What happens next: the session is over

Message contents after the type code:

H.6.4 PLUGIN_PROTOCOL

Direction: client to plugin

When: sent after PLUGIN_INIT_RESPONSE, after PLUGIN_PROTOCOL_REJECT, or after a previous auth phase terminates with PLUGIN_AUTH_SUCCESS or PLUGIN_AUTH_FAILURE

What happens next: the plugin will send PLUGIN_PROTOCOL_ACCEPT or PLUGIN_PROTOCOL_REJECT

Message contents after the type code:

H.6.5 PLUGIN_PROTOCOL_REJECT

Direction: plugin to client

When: sent after PLUGIN_PROTOCOL

What happens next: the client will either send another PLUGIN_PROTOCOL or terminate the session

Message contents after the type code:

H.6.6 PLUGIN_PROTOCOL_ACCEPT

Direction: plugin to client

When: sent after PLUGIN_PROTOCOL

What happens next: depends on the auth protocol agreed on. For keyboard-interactive, the client will send PLUGIN_KI_SERVER_REQUEST or PLUGIN_AUTH_SUCCESS or PLUGIN_AUTH_FAILURE. No other method is specified.

Message contents after the type code: none

H.6.7 PLUGIN_KI_SERVER_REQUEST

Direction: client to plugin

When: sent after PLUGIN_PROTOCOL_ACCEPT, or after a previous PLUGIN_KI_SERVER_RESPONSE, when the SSH server has sent SSH_MSG_USERAUTH_INFO_REQUEST

What happens next: the plugin will send either PLUGIN_KI_USER_REQUEST or PLUGIN_KI_SERVER_RESPONSE

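Message contents after the type code: the exact contents of the SSH_MSG_USERAUTH_INFO_REQUEST just sent by the server. See [RFC4256] section 3.2 for details. The summary:

string: name of this prompt collection
string: instructions to be displayed before this prompt collection
string: language tag (deprecated)
uint32: number of prompts in this collection
That many copies of:
string: prompt (in UTF-8)
boolean: whether the response to this prompt is safe to echo to the screen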
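
As a sketch of how a plugin might decode this message, the code below walks the field layout given in [RFC4256] section 3.2: name, instruction and language tag strings, a uint32 prompt count, then a prompt string and an echo boolean for each prompt. All type and function names here are hypothetical, the limit of 16 prompts is arbitrary, and the body is assumed to start immediately after the plugin message's type code.

```c
#include <stddef.h>
#include <stdint.h>

#define KI_MAX_PROMPTS 16     /* arbitrary limit chosen for this sketch */

typedef struct {
    const unsigned char *ptr; /* points into the body; not NUL-terminated */
    uint32_t len;
} ki_str;

typedef struct {
    ki_str name, instruction, language;
    uint32_t nprompts;
    ki_str prompt[KI_MAX_PROMPTS];
    int echo[KI_MAX_PROMPTS];
} ki_request;

typedef struct { const unsigned char *p; size_t left; } cursor;

static int take_uint32(cursor *c, uint32_t *v)
{
    if (c->left < 4)
        return -1;
    *v = ((uint32_t)c->p[0] << 24) | ((uint32_t)c->p[1] << 16) |
         ((uint32_t)c->p[2] << 8) | (uint32_t)c->p[3];
    c->p += 4;
    c->left -= 4;
    return 0;
}

static int take_string(cursor *c, ki_str *s)
{
    if (take_uint32(c, &s->len) < 0 || c->left < s->len)
        return -1;
    s->ptr = c->p;
    c->p += s->len;
    c->left -= s->len;
    return 0;
}

/* Returns 0 on success, -1 if the body is malformed or has too many prompts. */
int ki_parse_request(const unsigned char *body, size_t len, ki_request *rq)
{
    cursor c;
    uint32_t i;

    c.p = body;
    c.left = len;
    if (take_string(&c, &rq->name) < 0 ||
        take_string(&c, &rq->instruction) < 0 ||
        take_string(&c, &rq->language) < 0 ||
        take_uint32(&c, &rq->nprompts) < 0 ||
        rq->nprompts > KI_MAX_PROMPTS)
        return -1;
    for (i = 0; i < rq->nprompts; i++) {
        if (take_string(&c, &rq->prompt[i]) < 0 || c.left < 1)
            return -1;
        rq->echo[i] = (c.p[0] != 0);  /* any nonzero byte counts as true */
        c.p++;
        c.left--;
    }
    return c.left == 0 ? 0 : -1;      /* reject trailing bytes */
}
```

Every read is bounds-checked against the remaining length, so a truncated or oversized body is rejected rather than read past its end.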

H.6.8 PLUGIN_KI_SERVER_RESPONSE

Direction: plugin to client

When: response to PLUGIN_KI_SERVER_REQUEST, perhaps after one or more intervening pairs of PLUGIN_KI_USER_REQUEST and PLUGIN_KI_USER_RESPONSE

What happens next: the client will send a further PLUGIN_KI_SERVER_REQUEST, or PLUGIN_AUTH_SUCCESS or PLUGIN_AUTH_FAILURE

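Message contents after the type code: the exact contents of the SSH_MSG_USERAUTH_INFO_RESPONSE that the client should send back to the server. See [RFC4256] section 3.4 for details. The summary:

uint32: number of responses (must match the number of prompts in the request)
That many copies of:
string: response to the corresponding prompt (in UTF-8)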
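
A plugin might assemble this body as in the sketch below, which follows the layout in [RFC4256] section 3.4: a uint32 response count followed by one string per response. The function name ki_build_response is hypothetical, and the responses are assumed to be NUL-terminated C strings with no embedded zero bytes.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static void put_uint32(unsigned char *out, uint32_t v)
{
    out[0] = (unsigned char)(v >> 24);
    out[1] = (unsigned char)(v >> 16);
    out[2] = (unsigned char)(v >> 8);
    out[3] = (unsigned char)v;
}

/*
 * Build the message body (everything after the type code) in 'out',
 * which has room for 'cap' bytes. Returns the number of bytes
 * written, or 0 if the body does not fit.
 */
size_t ki_build_response(unsigned char *out, size_t cap,
                         const char *const *responses, uint32_t n)
{
    size_t used = 4;
    uint32_t i;

    if (cap < 4)
        return 0;
    put_uint32(out, n);                       /* number of responses */
    for (i = 0; i < n; i++) {
        size_t len = strlen(responses[i]);

        if (len > 0xFFFFFFFFu || cap - used < 4 || cap - used - 4 < len)
            return 0;
        put_uint32(out + used, (uint32_t)len);
        memcpy(out + used + 4, responses[i], len);
        used += 4 + len;
    }
    return used;
}
```

Since responses often contain secrets, a real plugin would probably also want to wipe the buffer once the message has been written.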

H.6.9 PLUGIN_KI_USER_REQUEST

Direction: plugin to client

When: response to PLUGIN_KI_SERVER_REQUEST, if the plugin cannot answer the server's auth prompts without presenting prompts of its own to the user

What happens next: the client will send PLUGIN_KI_USER_RESPONSE

Message contents after the type code: exactly the same as in PLUGIN_KI_SERVER_REQUEST (see section H.6.7).

H.6.10 PLUGIN_KI_USER_RESPONSE

Direction: client to plugin

When: response to PLUGIN_KI_USER_REQUEST

What happens next: the plugin will send PLUGIN_KI_SERVER_RESPONSE, or another PLUGIN_KI_USER_REQUEST

Message contents after the type code: exactly the same as in PLUGIN_KI_SERVER_RESPONSE (see section H.6.8).

H.6.11 PLUGIN_AUTH_SUCCESS

Direction: client to plugin

When: sent after PLUGIN_KI_SERVER_RESPONSE, or (in unusual cases) after PLUGIN_PROTOCOL_ACCEPT

What happens next: the client will either send another PLUGIN_PROTOCOL or terminate the session

Message contents after the type code: none

H.6.12 PLUGIN_AUTH_FAILURE

Direction: client to plugin

When: sent after PLUGIN_KI_SERVER_RESPONSE, or (in unusual cases) after PLUGIN_PROTOCOL_ACCEPT

What happens next: the client will either send another PLUGIN_PROTOCOL or terminate the session

Message contents after the type code: none

H.7 References

[RFC4251] RFC 4251, ‘The Secure Shell (SSH) Protocol Architecture’.

[RFC4252] RFC 4252, ‘The Secure Shell (SSH) Authentication Protocol’.

[RFC4256] RFC 4256, ‘Generic Message Exchange Authentication for the Secure Shell Protocol (SSH)’ (better known by its wire id ‘keyboard-interactive’).



[PuTTY development snapshot 2024-04-20.6b10eaa]