7.4.10. Smartscale Routing

SmartScale allows you to read your data, as much as possible, from a Replica data source.

In this read-write splitting mode, the connector intelligently determines if Replicas are up-to-date with respect to the Primary, and selects them in such a way that read operations are always strictly consistent with the last write of their current session. Sessions are per-connector-instance, in-memory objects that allow different connections to share SmartScale benefits: by providing the same session id, two connections will be able to see each other's write operations consistently.

Possible session ids are:

  • DATABASE: applications will see write operations made to the same database as it is connected to. Reads from other databases might be outdated depending on the Replica latency

  • USER: all connections that use the same user will read data consistent with the writes made by the current user. Other users data might be outdated.

  • CONNECTION: only writes made by the current connection are guaranteed to be read consistently. Writes from other connections might be outdated

  • PROVIDED_IN_DBNAME: Allows you to specify a variable sessionid in the database connection string . An application, typically PHP, can pass its own session id to make smart scale even more efficient.

Typical use cases

  • Applications which can use this level of consistency typically do relatively few writes and many reads. The writes that are performed can be considered to be in a 'silo' of their own, that is, a given application 'session' only writes and reads its own data and is not concerned with the data read/written by other application 'sessions'.

  • PHP applications are good candidates for SmartScale since PHP has embedded session IDs that can be passed at connection time.

  • Web based applications with user profiles match the scenario where users will update their own profile and want to see their modifications right away, but can accept latency on other users profiles.

Comparison with Direct Reads

  • Smart Scale allow session consistency, while Direct Reads always read from Replica, no matter whether data is up-to-date.

  • However, the cost for consistency appears at the performance level, since the Connector constantly needs to check Replica progress.

  • If your application needs to see the data it just wrote, use SmartScale

  • If your application does a lot of small reads that do not need to be up-to-date, use Direct Reads

Limitations

  • Prepared Statements - Prepared statements will need to be enclosed between transaction boundaries in order to work correctly with read/write splitting. This way, they will always execute on the Primary. Note that all prepared statements will become invalid upon switches or failover

  • Ephemeral objects - Temporary tables, session variables and other objects that are connection specific will not be accessible when reading data using SmartScale. If you need to use these ephemeral object, you should either add a "for update" statement in you selects or avoid using SmartScale

  • Read/write functions - Functions that create or modify data in the database should never be called with a simple SELECT statement. Always add "for update" a the end of the select call. Example:

    SELECT my_function_that_writes('param') FOR UPDATE
  • Currently, it is not recommended to use SMARTSCALE in conjunction with Parallel Apply. This is due to progress only being measured against the slowest channel.

    It is hoped that this will be resolved in a future release.

7.4.10.1. Specifying the Session ID

While the three first keywords (DATABASE, USER and CONNECTION) are connector-wide (a single connector instance will use these session ids for all connections), it is possible to configure the connector to allow a free string session ID. This string will have to be passed through the database name as {db_name}?sessionId={sessionID}. For example, when using a database named "test" and a session ID number 1234, the database name passed to the connector will be:

test?sessionId=1234

With mysql command line utility, the connection command will look like:

mysql -h connectorHost -u user -ppass -P 3306 test?sessionId=1234

JDBC clients will have to pass this session ID with a special tag, as follows:

jdbc:mysql://connector_host:3306/dbname@sessionId=1234?otherJdbcDriverOption=value

In order to use this feature, the special session id PROVIDED_IN_DBNAME needs to be specified at installation time

Also note that a session ID specified in the database name string will override the default provided in the connector configuration file. You can thus have a default session ID set for applications that can't specify it dynamically, and still allow other applications to connect with their own session ID variable.

7.4.10.2. Enabling SmartScale Routing

To enable SmartScale routing, configure the dataservice using the --connector-smartscale option. The session ID identification should also be specified by using the --connector-smartscale-sessionid option with one of the following values DATABASE, USER, CONNECTION or PROVIDED_IN_DBNAME:

shell> tpm configure alpha \
... \
--connector-smartscale=true \
--connector-smartscale-sessionid={DATABASE|USER|CONNECTION|PROVIDED_IN_DBNAME}

In this mode, any client application can open a connection to the connector, and queries will automatically be redirected according to the SQL statement content.

In addition, all users that connect to the database must be granted the REPLICATION CLIENT privilege so that the user can compare the current replicator progress for session information. This can be granted using:

mysql> GRANT REPLICATION CLIENT ON *.* to app_user@'%'

7.4.10.3. Disabling SmartScale Routing

To disable SmartScale routing if it has been previously configured:

shell> tpm configure alpha --connector-smartscale=false

7.4.10.4. SmartScale Exploit

When using DATABASE session ID, you can bypass session consistency to read from available Replicas by simply connecting to one database and reading from another. For example:

shell> mysql -u... -p... -P3306 db1
mysql> select * from db2.user

As long as no update is made on db1 in the meantime, the select will be executed on a Replica (if available)