The fan-in topology is the logical opposite of a Primary/Replica topology. In a fan-in topology, the data from two or more Sources is combined on one Target. Fan-in topologies are often used in situations where you have satellite databases, for example for sales or retail operations, and need to combine that information in a single database for processing.
Some additional considerations need to be made when using fan-in topologies:
If the same tables from each machine are being merged together, it is possible to get collisions in the data where auto-increment is used. The effects can be minimized by using increment offsets within the MySQL configuration:
auto-increment-offset = 1
auto-increment-increment = 4
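The arithmetic behind these two settings can be sketched in shell to show why staggered offsets avoid collisions. This is only an illustration: the helper name `auto_increment_ids` is hypothetical, and the sketch assumes every Source shares the same increment and is given a distinct offset no larger than that increment.

```shell
# Print the first COUNT auto-increment values a server would generate
# when configured with the given offset and increment.
# NOTE: auto_increment_ids is a hypothetical helper used for illustration.
auto_increment_ids() {
    offset=$1
    increment=$2
    count=$3
    i=0
    ids=""
    while [ "$i" -lt "$count" ]; do
        ids="$ids $((offset + i * increment))"
        i=$((i + 1))
    done
    # Strip the leading space before printing.
    echo "${ids# }"
}

auto_increment_ids 1 4 3   # first Source (offset 1):  1 5 9
auto_increment_ids 2 4 3   # second Source (offset 2): 2 6 10
```

Because each Source starts from a different offset and steps by the same increment, the generated values never overlap. An increment larger than the current number of Sources leaves room to add more Sources later.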
Fan-in can work more effectively, and be less prone to problems with the
corresponding data, by configuring specific tables at different sites.
For example, with two sites in New York and San Jose, databases and
tables can be prefixed with the site name, i.e.
Alternatively, a filter can be configured to rename the database
sales dynamically to the
corresponding location-based tables. See
Section 9.4.30, “Rename Filter” for more information.
Statement-based replication will work for most instances, but where your
statements update data dynamically within the statement, in fan-in
the information may get increased according to the number of fan-in
Sources. Update your configuration file to explicitly use row-based
replication by adding the following:
binlog-format = row
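Before installing, it can be useful to confirm that the option file on each Source really does select row-based logging. The following is a minimal sketch and not part of Tungsten Replicator: the function name `uses_row_binlog` is hypothetical, and the path to the option file is passed in by the caller because its location varies between installations. The pattern accepts both the `binlog-format` and `binlog_format` spellings and ignores case.

```shell
# Return success (exit status 0) if the given MySQL option file contains a
# line that sets the binary log format to row-based logging.
# NOTE: uses_row_binlog is a hypothetical helper name used for illustration.
uses_row_binlog() {
    grep -Eiq '^[[:space:]]*binlog[-_]format[[:space:]]*=[[:space:]]*row[[:space:]]*$' "$1"
}

# Example usage; the path is supplied by the caller.
# uses_row_binlog /path/to/option-file && echo "row-based logging configured"
```

This only inspects the text of the file. It does not check which option group the line belongs to, nor what the running server is actually using.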
Triggers can cause problems during fan-in replication if two different statements from each Source are replicated to the Target and cause the operations to be triggered multiple times. Tungsten Replicator cannot prevent triggers from executing on the concentrator host, and there is no way to selectively disable triggers. Check at the trigger level whether you are executing on a Source or Target. For more information, see Section C.3.1, “Triggers”.
To create the configuration, the Extractors and services must be specified; the topology specification takes care of the actual configuration:
./tools/tpm configure epsilon \
    --topology=fan-in \
    --install-directory=/opt/continuent \
    --replication-user=tungsten \
    --replication-password=password \
    --master=host1,host2 \
    --members=host1,host2,host3 \
    --master-services=alpha,beta \
    --start-and-report=true
[epsilon]
topology=fan-in
install-directory=/opt/continuent
replication-user=tungsten
replication-password=password
master=host1,host2
members=host1,host2,host3
master-services=alpha,beta
start-and-report=true
The description of each of the options is shown below:
--topology: Replication topology for the dataservice. Valid values are star, cluster-slave, master-slave, fan-in, clustered, cluster-alias, all-masters, direct.
--install-directory: Path to the directory where the active deployment will be installed. The configured directory will contain the software, THL and relay log information unless configured otherwise.
--replication-user: For databases that require authentication, the username to use when connecting to the database using the corresponding connection method (native, JDBC, etc.).
--replication-password: The password to be used when connecting to the database.
--master: The hostname of the Primary (extractor) within the current service. If the current host does not match this specification, then the deployment will by default be configured as a Replica/applier.
--members: Hostnames for the dataservice members.
--master-services: Data service names that should be used on each Primary.
--start-and-report: Start the services and report out the status after configuration.
If the installation process fails, check the output of the
/tmp/tungsten-configure.log file for
more information about the root cause.
Once the installation has been completed, the service will be started and ready to use.