SmartScale allows you to read your data, as much as possible, from a Replica data source.
In this read-write splitting mode, the connector intelligently determines if Replicas are up-to-date with respect to the Primary, and selects them in such a way that read operations are always strictly consistent with the last write of their current session. Sessions are per-connector-instance, in-memory objects that allow different connections to share SmartScale benefits: by providing the same session id, two connections will be able to see each other's write operations consistently.
Possible session ids are:
DATABASE: applications will see write operations made to the same database as it is connected to. Reads from other databases might be outdated depending on the Replica latency
USER: all connections that use the same user will read data consistent with the writes made by the current user. Other users data might be outdated.
CONNECTION: only writes made by the current connection are guaranteed to be read consistently. Writes from other connections might be outdated
PROVIDED_IN_DBNAME: Allows you to specify a variable sessionid in the database connection string . An application, typically PHP, can pass its own session id to make smart scale even more efficient.
Typical use cases
Applications which can use this level of consistency typically do relatively few writes and many reads. The writes that are performed can be considered to be in a 'silo' of their own, that is, a given application 'session' only writes and reads its own data and is not concerned with the data read/written by other application 'sessions'.
PHP applications are good candidates for SmartScale since PHP has embedded session IDs that can be passed at connection time.
Web based applications with user profiles match the scenario where users will update their own profile and want to see their modifications right away, but can accept latency on other users profiles.
Comparison with Direct Reads
Smart Scale allow session consistency, while Direct Reads always read from Replica, no matter whether data is up-to-date.
However, the cost for consistency appears at the performance level, since the Connector constantly needs to check Replica progress.
If your application needs to see the data it just wrote, use SmartScale
If your application does a lot of small reads that do not need to be up-to-date, use Direct Reads
Limitations
Prepared Statements - Prepared statements will need to be enclosed between transaction boundaries in order to work correctly with read/write splitting. This way, they will always execute on the Primary. Note that all prepared statements will become invalid upon switches or failover
Ephemeral objects - Temporary tables, session variables and other objects that are connection specific will not be accessible when reading data using SmartScale. If you need to use these ephemeral object, you should either add a "for update" statement in you selects or avoid using SmartScale
Read/write functions - Functions that create or modify data in the database should never be called with a simple SELECT statement. Always add "for update" a the end of the select call. Example:
SELECT my_function_that_writes('param') FOR UPDATE
Currently, it is not recommended to use SMARTSCALE in conjunction with Parallel Apply. This is due to progress only being measured against the slowest channel.
It is hoped that this will be resolved in a future release.
While the three first keywords (DATABASE, USER and CONNECTION) are
connector-wide (a single connector instance will use these session ids
for all connections), it is possible to configure the connector to
allow a free string session ID. This string will have to be passed
through the database name as
{db_name}?sessionId={sessionID}
. For example, when
using a database named "test" and a session ID number 1234, the
database name passed to the connector will be:
test?sessionId=1234
With mysql command line utility, the connection command will look like:
mysql -h connectorHost -u user -ppass -P 3306 test?sessionId=1234
JDBC clients will have to pass this session ID with a special tag, as follows:
jdbc:mysql://connector_host:3306/dbname@sessionId=1234?otherJdbcDriverOption=value
In order to use this feature, the special session id PROVIDED_IN_DBNAME needs to be specified at installation time
Also note that a session ID specified in the database name string will override the default provided in the connector configuration file. You can thus have a default session ID set for applications that can't specify it dynamically, and still allow other applications to connect with their own session ID variable.
To enable SmartScale routing, configure the dataservice using the
--connector-smartscale
option. The
session ID identification should also be specified by using the
--connector-smartscale-sessionid
option with one of the following values DATABASE
,
USER
, CONNECTION
or
PROVIDED_IN_DBNAME
:
shell> tpm configure alpha \
... \
--connector-smartscale=true \
--connector-smartscale-sessionid={DATABASE|USER|CONNECTION|PROVIDED_IN_DBNAME}
In this mode, any client application can open a connection to the connector, and queries will automatically be redirected according to the SQL statement content.
In addition, all users that connect to the database must be granted
the REPLICATION CLIENT
privilege so that the user can compare the current replicator progress
for session information. This can be granted using:
mysql> GRANT REPLICATION CLIENT ON *.* to app_user@'%'
To disable SmartScale routing if it has been previously configured:
shell> tpm configure alpha --connector-smartscale=false
When using DATABASE session ID, you can bypass session consistency to read from available Replicas by simply connecting to one database and reading from another. For example:
shell>mysql -u... -p... -P3306 db1
mysql>select * from db2.user
As long as no update is made on db1 in the meantime, the select will be executed on a Replica (if available)