mongodb副本集同步问题【已解决】
发布于 6 年前 作者 lovegnep 7476 次浏览 来自 问答

配置了具有3个节点的副本集,A(主),B(从),C(从), 配置完成后没啥问题,向主节点插入一条数据,在两个从节点中都可以读到。 随后向主节点插入了100万条数据,插入完成后,查看两个从节点,发现只同步了部分数据,B中大概有19万,C中大概有7000多。 等了几个时间后再次查看同步状态,从节点中的数据量还是那么多(B中大概有19万,C中大概有7000多)。 不知道为啥。 一些数据如下

wcgroup:PRIMARY> rs.printReplicationInfo()
configured oplog size:   1640.357421875MB
log length start to end: 25613secs (7.11hrs)
oplog first event time:  Thu May 10 2018 09:18:48 GMT+0800 (CST)
oplog last event time:   Thu May 10 2018 16:25:41 GMT+0800 (CST)
now:                     Thu May 10 2018 16:25:46 GMT+0800 (CST)
wcgroup:PRIMARY> rs.printSlaveReplicationInfo()
source: 47.105.36.1:20005
	syncedTo: Thu May 10 2018 11:10:24 GMT+0800 (CST)
	18917 secs (5.25 hrs) behind the primary 
source: 39.108.56.116:20005
	syncedTo: Thu May 10 2018 11:53:58 GMT+0800 (CST)
	16303 secs (4.53 hrs) behind the primary 
wcgroup:PRIMARY>
wcgroup:PRIMARY> rs.status()
{
	"set" : "wcgroup",
	"date" : ISODate("2018-05-10T08:24:21.745Z"),
	"myState" : 1,
	"term" : NumberLong(9),
	"heartbeatIntervalMillis" : NumberLong(2000),
	"optimes" : {
		"lastCommittedOpTime" : {
			"ts" : Timestamp(0, 0),
			"t" : NumberLong(-1)
		},
		"appliedOpTime" : {
			"ts" : Timestamp(1525940661, 1),
			"t" : NumberLong(9)
		},
		"durableOpTime" : {
			"ts" : Timestamp(1525940661, 1),
			"t" : NumberLong(9)
		}
	},
	"members" : [
		{
			"_id" : 0,
			"name" : "47.98.136.138:20005",
			"health" : 1,
			"state" : 1,
			"stateStr" : "PRIMARY",
			"uptime" : 1596,
			"optime" : {
				"ts" : Timestamp(1525940661, 1),
				"t" : NumberLong(9)
			},
			"optimeDate" : ISODate("2018-05-10T08:24:21Z"),
			"electionTime" : Timestamp(1525939200, 1),
			"electionDate" : ISODate("2018-05-10T08:00:00Z"),
			"configVersion" : 1,
			"self" : true
		},
		{
			"_id" : 1,
			"name" : "47.105.36.1:20005",
			"health" : 1,
			"state" : 2,
			"stateStr" : "SECONDARY",
			"uptime" : 1440,
			"optime" : {
				"ts" : Timestamp(1525921824, 24),
				"t" : NumberLong(2)
			},
			"optimeDurable" : {
				"ts" : Timestamp(1525921824, 24),
				"t" : NumberLong(2)
			},
			"optimeDate" : ISODate("2018-05-10T03:10:24Z"),
			"optimeDurableDate" : ISODate("2018-05-10T03:10:24Z"),
			"lastHeartbeat" : ISODate("2018-05-10T08:24:20.694Z"),
			"lastHeartbeatRecv" : ISODate("2018-05-10T08:24:20.248Z"),
			"pingMs" : NumberLong(27),
			"syncingTo" : "47.98.136.138:20005",
			"configVersion" : 1
		},
		{
			"_id" : 2,
			"name" : "39.108.56.116:20005",
			"health" : 1,
			"state" : 2,
			"stateStr" : "SECONDARY",
			"uptime" : 1467,
			"optime" : {
				"ts" : Timestamp(1525924438, 133),
				"t" : NumberLong(4)
			},
			"optimeDurable" : {
				"ts" : Timestamp(1525924438, 133),
				"t" : NumberLong(4)
			},
			"optimeDate" : ISODate("2018-05-10T03:53:58Z"),
			"optimeDurableDate" : ISODate("2018-05-10T03:53:58Z"),
			"lastHeartbeat" : ISODate("2018-05-10T08:24:21.003Z"),
			"lastHeartbeatRecv" : ISODate("2018-05-10T08:24:21.230Z"),
			"pingMs" : NumberLong(25),
			"syncingTo" : "47.98.136.138:20005",
			"configVersion" : 1
		}
	],
	"ok" : 1,
	"operationTime" : Timestamp(1525940661, 1),
	"$clusterTime" : {
		"clusterTime" : Timestamp(1525940661, 1),
		"signature" : {
			"hash" : BinData(0,"H2lc5RSgUsrpU6CpFwuWa9pTRyk="),
			"keyId" : NumberLong("6553755622771261441")
		}
	}
}
wcgroup:PRIMARY>

下面是从节点B的log

term: 9 } active: 1 callbackHandle.valid: 1 callbackHandle.cancelled: 0 attempt: 1 retryPolicy: RetryPolicyImpl maxAttempts: 1 maxTimeMillis: -1ms
2018-05-10T16:02:06.937+0800 I ASIO     [NetworkInterfaceASIO-RS-0] Connecting to 47.98.136.138:20005
2018-05-10T16:02:07.074+0800 I ASIO     [NetworkInterfaceASIO-RS-0] Successfully connected to 47.98.136.138:20005, took 137ms (2 connections now open to 47.98.136.138:20005)
2018-05-10T16:02:14.855+0800 I NETWORK  [listener] connection accepted from 172.18.223.34:50694 #6 (4 connections now open)
2018-05-10T16:02:14.855+0800 I NETWORK  [conn6] received client metadata from 172.18.223.34:50694 conn6: { application: { name: "MongoDB Shell" }, driver: { name: "MongoDB Internal Client", version: "3.6.4" }, os: { type: "Linux", name: "Ubuntu", architecture: "x86_64", version: "16.04" } }
2018-05-10T16:02:14.857+0800 I ACCESS   [conn6] Unauthorized: not authorized on admin to execute command { getLog: "startupWarnings", $clusterTime: { clusterTime: Timestamp(1525939331, 1), signature: { hash: BinData(0, 39355313E0D4149A63F1E692FCFB9ADF28A8819E), keyId: 6553755622771261441 } }, $db: "admin" }
2018-05-10T16:02:14.859+0800 I ACCESS   [conn6] Unauthorized: not authorized on admin to execute command { replSetGetStatus: 1.0, forShell: 1.0, $clusterTime: { clusterTime: Timestamp(1525939331, 1), signature: { hash: BinData(0, 39355313E0D4149A63F1E692FCFB9ADF28A8819E), keyId: 6553755622771261441 } }, $db: "admin" }
2018-05-10T16:03:10.942+0800 I ASIO     [NetworkInterfaceASIO-RS-0] Ending connection to host 47.98.136.138:20005 due to bad connection status; 1 connections to that host remain open
2018-05-10T16:03:10.942+0800 I REPL     [replication-0] Restarting oplog query due to error: NetworkInterfaceExceededTimeLimit: error in fetcher batch callback: Operation timed out. Last fetched optime (with hash): { ts: Timestamp(1525924438, 133), t: 4 }[4350505511029906934]. Restarts remaining: 1
2018-05-10T16:03:10.942+0800 I REPL     [replication-0] Scheduled new oplog query Fetcher source: 47.98.136.138:20005 database: local query: { find: "oplog.rs", filter: { ts: { $gte: Timestamp(1525924438, 133) } }, tailable: true, oplogReplay: true, awaitData: true, maxTimeMS: 60000, batchSize: 13981010, term: 9 } query metadata: { $replData: 1, $oplogQueryData: 1, $readPreference: { mode: "secondaryPreferred" } } active: 1 findNetworkTimeout: 65000ms getMoreNetworkTimeout: 10000ms shutting down?: 0 first: 1 firstCommandScheduler: RemoteCommandRetryScheduler request: RemoteCommand 580 -- target:47.98.136.138:20005 db:local cmd:{ find: "oplog.rs", filter: { ts: { $gte: Timestamp(1525924438, 133) } }, tailable: true, oplogReplay: true, awaitData: true, maxTimeMS: 60000, batchSize: 13981010, term: 9 } active: 1 callbackHandle.valid: 1 callbackHandle.cancelled: 0 attempt: 1 retryPolicy: RetryPolicyImpl maxAttempts: 1 maxTimeMillis: -1ms
2018-05-10T16:03:12.428+0800 I ASIO     [NetworkInterfaceASIO-RS-0] Connecting to 47.98.136.138:20005
2018-05-10T16:03:12.576+0800 I ASIO     [NetworkInterfaceASIO-RS-0] Successfully connected to 47.98.136.138:20005, took 148ms (2 connections now open to 47.98.136.138:20005)
2018-05-10T16:03:27.074+0800 I ACCESS   [conn6] Successfully authenticated as principal lovegnep on admin
2018-05-10T16:03:44.053+0800 I ACCESS   [conn6] Successfully authenticated as principal lovegnep_wcgroup on wcgroup
2018-05-10T16:04:15.942+0800 I ASIO     [NetworkInterfaceASIO-RS-0] Ending connection to host 47.98.136.138:20005 due to bad connection status; 1 connections to that host remain open
2018-05-10T16:04:15.942+0800 I REPL     [replication-0] Error returned from oplog query (no more query restarts left): NetworkInterfaceExceededTimeLimit: error in fetcher batch callback: Operation timed out
2018-05-10T16:04:15.942+0800 W REPL     [rsBackgroundSync] Fetcher stopped querying remote oplog with error: NetworkInterfaceExceededTimeLimit: error in fetcher batch callback: Operation timed out
2018-05-10T16:04:15.942+0800 I REPL     [rsBackgroundSync] sync source candidate: 47.98.136.138:20005
2018-05-10T16:04:16.001+0800 I REPL     [rsBackgroundSync] Chose same sync source candidate as last time, 47.98.136.138:20005. Sleeping for 1 second to avoid immediately choosing a new sync source for the same reason as last time.
2018-05-10T16:04:17.002+0800 I ASIO     [NetworkInterfaceASIO-RS-0] Connecting to 47.98.136.138:20005
2018-05-10T16:04:17.139+0800 I ASIO     [NetworkInterfaceASIO-RS-0] Successfully connected to 47.98.136.138:20005, took 137ms (2 connections now open to 47.98.136.138:20005)
2018-05-10T16:04:53.533+0800 I NETWORK  [thread7] Starting new replica set monitor for wcgroup/39.108.56.116:20005,47.105.36.1:20005,47.98.136.138:20005
2018-05-10T16:04:53.538+0800 I NETWORK  [listener] connection accepted from 39.108.56.116:38170 #7 (5 connections now open)
2018-05-10T16:04:53.539+0800 I NETWORK  [conn7] received client metadata from 39.108.56.116:38170 conn7: { driver: { name: "MongoDB Internal Client", version: "3.6.4" }, os: { type: "Linux", name: "Ubuntu", architecture: "x86_64", version: "16.04" } }
2018-05-10T16:04:53.540+0800 I NETWORK  [ReplicaSetMonitor-TaskExecutor-0] Successfully connected to 39.108.56.116:20005 (1 connections now open to 39.108.56.116:20005 with a 5 second timeout)
2018-05-10T16:04:53.586+0800 I NETWORK  [thread7] Successfully connected to 47.98.136.138:20005 (1 connections now open to 47.98.136.138:20005 with a 5 second timeout)
2018-05-10T16:04:53.644+0800 I NETWORK  [ReplicaSetMonitor-TaskExecutor-0] Successfully connected to 47.105.36.1:20005 (1 connections now open to 47.105.36.1:20005 with a 5 second timeout)
2018-05-10T16:04:53.664+0800 I NETWORK  [thread7] Successfully connected to 47.98.136.138:20005 (1 connections now open to 47.98.136.138:20005 with a 0 second timeout)
2018-05-10T16:04:53.743+0800 I NETWORK  [thread7] scoped connection to 47.98.136.138:20005 not being returned to the pool
2018-05-10T16:04:53.743+0800 I NETWORK  [thread7] Starting new replica set monitor for wcgroup/39.108.56.116:20005,47.105.36.1:20005,47.98.136.138:20005
2018-05-10T16:04:53.850+0800 I NETWORK  [thread7] Successfully connected to 47.98.136.138:20005 (2 connections now open to 47.98.136.138:20005 with a 0 second timeout)
2018-05-10T16:04:53.933+0800 I NETWORK  [thread7] scoped connection to 47.98.136.138:20005 not being returned to the pool
2018-05-10T16:04:53.933+0800 I NETWORK  [thread7] Starting new replica set monitor for wcgroup/39.108.56.116:20005,47.105.36.1:20005,47.98.136.138:20005
2018-05-10T16:04:54.037+0800 I NETWORK  [thread7] Successfully connected to 47.98.136.138:20005 (3 connections now open to 47.98.136.138:20005 with a 0 second timeout)
2018-05-10T16:04:54.115+0800 I NETWORK  [thread7] scoped connection to 47.98.136.138:20005 not being returned to the pool
2018-05-10T16:04:54.115+0800 I NETWORK  [thread7] Starting new replica set monitor for wcgroup/39.108.56.116:20005,47.105.36.1:20005,47.98.136.138:20005
2018-05-10T16:04:54.221+0800 I NETWORK  [thread7] Successfully connected to 47.98.136.138:20005 (4 connections now open to 47.98.136.138:20005 with a 0 second timeout)
2018-05-10T16:04:54.307+0800 I NETWORK  [thread7] scoped connection to 47.98.136.138:20005 not being returned to the pool
2018-05-10T16:05:20.792+0800 I NETWORK  [listener] connection accepted from 47.105.36.1:58030 #8 (6 connections now open)
2018-05-10T16:05:20.792+0800 I NETWORK  [conn8] received client metadata from 47.105.36.1:58030 conn8: { driver: { name: "MongoDB Internal Client", version: "3.6.4" }, os: { type: "Linux", name: "Ubuntu", architecture: "x86_64", version: "16.04" } }
2018-05-10T16:05:22.002+0800 I ASIO     [NetworkInterfaceASIO-RS-0] Ending connection to host 47.98.136.138:20005 due to bad connection status; 1 connections to that host remain open
2018-05-10T16:05:22.002+0800 I REPL     [replication-0] Restarting oplog query due to error: NetworkInterfaceExceededTimeLimit: error in fetcher batch callback: Operation timed out. Last fetched optime (with hash): { ts: Timestamp(1525924438, 133), t: 4 }[4350505511029906934]. Restarts remaining: 3
2018-05-10T16:05:22.002+0800 I REPL     [replication-0] Scheduled new oplog query Fetcher source: 47.98.136.138:20005 database: local query: { find: "oplog.rs", filter: { ts: { $gte: Timestamp(1525924438, 133) } }, tailable: true, oplogReplay: true, awaitData: true, maxTimeMS: 60000, batchSize: 13981010, term: 9 } query metadata: { $replData: 1, $oplogQueryData: 1, $readPreference: { mode: "secondaryPreferred" } } active: 1 findNetworkTimeout: 65000ms getMoreNetworkTimeout: 10000ms shutting down?: 0 first: 1 firstCommandScheduler: RemoteCommandRetryScheduler request: RemoteCommand 917 -- target:47.98.136.138:20005 db:local cmd:{ find: "oplog.rs", filter: { ts: { $gte: Timestamp(1525924438, 133) } }, tailable: true, oplogReplay: true, awaitData: true, maxTimeMS: 60000, batchSize: 13981010, term: 9 } active: 1 callbackHandle.valid: 1 callbackHandle.cancelled: 0 attempt: 1 retryPolicy: RetryPolicyImpl maxAttempts: 1 maxTimeMillis: -1ms
2018-05-10T16:05:22.592+0800 I ASIO     [NetworkInterfaceASIO-RS-0] Connecting to 47.98.136.138:20005
2018-05-10T16:05:22.728+0800 I ASIO     [NetworkInterfaceASIO-RS-0] Successfully connected to 47.98.136.138:20005, took 136ms (2 connections now open to 47.98.136.138:20005)
2018-05-10T16:06:27.003+0800 I REPL     [replication-0] Restarting oplog query due to error: NetworkInterfaceExceededTimeLimit: error in fetcher batch callback: Operation timed out. Last fetched optime (with hash): { ts: Timestamp(1525924438, 133), t: 4 }[4350505511029906934]. Restarts remaining: 2
2018-05-10T16:06:27.003+0800 I ASIO     [NetworkInterfaceASIO-RS-0] Ending connection to host 47.98.136.138:20005 due to bad connection status; 1 connections to that host remain open
2018-05-10T16:06:27.003+0800 I REPL     [replication-0] Scheduled new oplog query Fetcher source: 47.98.136.138:20005 database: local query: { find: "oplog.rs", filter: { ts: { $gte: Timestamp(1525924438, 133) } }, tailable: true, oplogReplay: true, awaitData: true, maxTimeMS: 60000, batchSize: 13981010, term: 9 } query metadata: { $replData: 1, $oplogQueryData: 1, $readPreference: { mode: "secondaryPreferred" } } active: 1 findNetworkTimeout: 65000ms getMoreNetworkTimeout: 10000ms shutting down?: 0 first: 1 firstCommandScheduler: RemoteCommandRetryScheduler request: RemoteCommand 1077 -- target:47.98.136.138:20005 db:local cmd:{ find: "oplog.rs", filter: { ts: { $gte: Timestamp(1525924438, 133) } }, tailable: true, oplogReplay: true, awaitData: true, maxTimeMS: 60000, batchSize: 13981010, term: 9 } active: 1 callbackHandle.valid: 1 callbackHandle.cancelled: 0 attempt: 1 retryPolicy: RetryPolicyImpl maxAttempts: 1 maxTimeMillis: -1ms
2018-05-10T16:06:28.082+0800 I ASIO     [NetworkInterfaceASIO-RS-0] Connecting to 47.98.136.138:20005
2018-05-10T16:06:28.214+0800 I ASIO     [NetworkInterfaceASIO-RS-0] Successfully connected to 47.98.136.138:20005, took 132ms (2 connections now open to 47.98.136.138:20005)
2018-05-10T16:07:32.003+0800 I ASIO     [NetworkInterfaceASIO-RS-0] Ending connection to host 47.98.136.138:20005 due to bad connection status; 1 connections to that host remain open
2018-05-10T16:07:32.003+0800 I REPL     [replication-0] Restarting oplog query due to error: NetworkInterfaceExceededTimeLimit: error in fetcher batch callback: Operation timed out. Last fetched optime (with hash): { ts: Timestamp(1525924438, 133), t: 4 }[4350505511029906934]. Restarts remaining: 1
2018-05-10T16:07:32.003+0800 I REPL     [replication-0] Scheduled new oplog query Fetcher source: 47.98.136.138:20005 database: local query: { find: "oplog.rs", filter: { ts: { $gte: Timestamp(1525924438, 133) } }, tailable: true, oplogReplay: true, awaitData: true, maxTimeMS: 60000, batchSize: 13981010, term: 9 } query metadata: { $replData: 1, $oplogQueryData: 1, $readPreference: { mode: "secondaryPreferred" } } active: 1 findNetworkTimeout: 65000ms getMoreNetworkTimeout: 10000ms shutting down?: 0 first: 1 firstCommandScheduler: RemoteCommandRetryScheduler request: RemoteCommand 1235 -- target:47.98.136.138:20005 db:local cmd:{ find: "oplog.rs", filter: { ts: { $gte: Timestamp(1525924438, 133) } }, tailable: true, oplogReplay: true, awaitData: true, maxTimeMS: 60000, batchSize: 13981010, term: 9 } active: 1 callbackHandle.valid: 1 callbackHandle.cancelled: 0 attempt: 1 retryPolicy: RetryPolicyImpl maxAttempts: 1 maxTimeMillis: -1ms
2018-05-10T16:07:33.558+0800 I ASIO     [NetworkInterfaceASIO-RS-0] Connecting to 47.98.136.138:20005
2018-05-10T16:07:33.686+0800 I ASIO     [NetworkInterfaceASIO-RS-0] Successfully connected to 47.98.136.138:20005, took 128ms (2 connections now open to 47.98.136.138:20005)
2018-05-10T16:08:37.003+0800 I ASIO     [NetworkInterfaceASIO-RS-0] Ending connection to host 47.98.136.138:20005 due to bad connection status; 1 connections to that host remain open
2018-05-10T16:08:37.003+0800 I REPL     [replication-0] Error returned from oplog query (no more query restarts left): NetworkInterfaceExceededTimeLimit: error in fetcher batch callback: Operation timed out
2018-05-10T16:08:37.003+0800 W REPL     [rsBackgroundSync] Fetcher stopped querying remote oplog with error: NetworkInterfaceExceededTimeLimit: error in fetcher batch callback: Operation timed out
2018-05-10T16:08:37.003+0800 I REPL     [rsBackgroundSync] sync source candidate: 47.98.136.138:20005
2018-05-10T16:08:37.054+0800 I REPL     [rsBackgroundSync] Chose same sync source candidate as last time, 47.98.136.138:20005. Sleeping for 1 second to avoid immediately choosing a new sync source for the same reason as last time.
2018-05-10T16:08:38.054+0800 I ASIO     [NetworkInterfaceASIO-RS-0] Connecting to 47.98.136.138:20005
2018-05-10T16:08:38.193+0800 I ASIO     [NetworkInterfaceASIO-RS-0] Successfully connected to 47.98.136.138:20005, took 139ms (2 connections now open to 47.98.136.138:20005)
2018-05-10T16:09:43.054+0800 I ASIO     [NetworkInterfaceASIO-RS-0] Ending connection to host 47.98.136.138:20005 due to bad connection status; 1 connections to that host remain open
2018-05-10T16:09:43.054+0800 I REPL     [replication-0] Restarting oplog query due to error: NetworkInterfaceExceededTimeLimit: error in fetcher batch callback: Operation timed out. Last fetched optime (with hash): { ts: Timestamp(1525924438, 133), t: 4 }[4350505511029906934]. Restarts remaining: 3
2018-05-10T16:09:43.055+0800 I REPL     [replication-0] Scheduled new oplog query Fetcher source: 47.98.136.138:20005 database: local query: { find: "oplog.rs", filter: { ts: { $gte: Timestamp(1525924438, 133) } }, tailable: true, oplogReplay: true, awaitData: true, maxTimeMS: 60000, batchSize: 13981010, term: 9 } query metadata: { $replData: 1, $oplogQueryData: 1, $readPreference: { mode: "secondaryPreferred" } } active: 1 findNetworkTimeout: 65000ms getMoreNetworkTimeout: 10000ms shutting down?: 0 first: 1 firstCommandScheduler: RemoteCommandRetryScheduler request: RemoteCommand 1561 -- target:47.98.136.138:20005 db:local cmd:{ find: "oplog.rs", filter: { ts: { $gte: Timestamp(1525924438, 133) } }, tailable: true, oplogReplay: true, awaitData: true, maxTimeMS: 60000, batchSize: 13981010, term: 9 } active: 1 callbackHandle.valid: 1 callbackHandle.cancelled: 0 attempt: 1 retryPolicy: RetryPolicyImpl maxAttempts: 1 maxTimeMillis: -1ms
2018-05-10T16:09:43.396+0800 I ASIO     [NetworkInterfaceASIO-RS-0] Connecting to 47.98.136.138:20005
2018-05-10T16:09:43.534+0800 I ASIO     [NetworkInterfaceASIO-RS-0] Successfully connected to 47.98.136.138:20005, took 138ms (2 connections now open to 47.98.136.138:20005)
2018-05-10T16:09:53.533+0800 I NETWORK  [thread10] Starting new replica set monitor for wcgroup/39.108.56.116:20005,47.105.36.1:20005,47.98.136.138:20005
2018-05-10T16:09:53.614+0800 I NETWORK  [thread10] Successfully connected to 47.98.136.138:20005 (5 connections now open to 47.98.136.138:20005 with a 0 second timeout)
2018-05-10T16:09:53.696+0800 I NETWORK  [thread10] scoped connection to 47.98.136.138:20005 not being returned to the pool
2018-05-10T16:09:53.696+0800 I NETWORK  [thread10] Starting new replica set monitor for wcgroup/39.108.56.116:20005,47.105.36.1:20005,47.98.136.138:20005
2018-05-10T16:09:53.802+0800 I NETWORK  [thread10] Successfully connected to 47.98.136.138:20005 (6 connections now open to 47.98.136.138:20005 with a 0 second timeout)
2018-05-10T16:09:53.883+0800 I NETWORK  [thread10] scoped connection to 47.98.136.138:20005 not being returned to the pool
2018-05-10T16:09:53.883+0800 I NETWORK  [thread10] Starting new replica set monitor for wcgroup/39.108.56.116:20005,47.105.36.1:20005,47.98.136.138:20005
2018-05-10T16:09:53.992+0800 I NETWORK  [thread10] Successfully connected to 47.98.136.138:20005 (7 connections now open to 47.98.136.138:20005 with a 0 second timeout)
2018-05-10T16:09:54.076+0800 I NETWORK  [thread10] scoped connection to 47.98.136.138:20005 not being returned to the pool
2018-05-10T16:09:54.076+0800 I NETWORK  [thread10] Starting new replica set monitor for wcgroup/39.108.56.116:20005,47.105.36.1:20005,47.98.136.138:20005
2018-05-10T16:09:54.178+0800 I NETWORK  [thread10] Successfully connected to 47.98.136.138:20005 (8 connections now open to 47.98.136.138:20005 with a 0 second timeout)
2018-05-10T16:09:54.258+0800 I NETWORK  [thread10] scoped connection to 47.98.136.138:20005 not being returned to the pool
2018-05-10T16:10:48.055+0800 I ASIO     [NetworkInterfaceASIO-RS-0] Ending connection to host 47.98.136.138:20005 due to bad connection status; 1 connections to that host remain open
2018-05-10T16:10:48.055+0800 I REPL     [replication-0] Restarting oplog query due to error: NetworkInterfaceExceededTimeLimit: error in fetcher batch callback: Operation timed out. Last fetched optime (with hash): { ts: Timestamp(1525924438, 133), t: 4 }[4350505511029906934]. Restarts remaining: 2
2018-05-10T16:10:48.055+0800 I REPL     [replication-0] Scheduled new oplog query Fetcher source: 47.98.136.138:20005 database: local query: { find: "oplog.rs", filter: { ts: { $gte: Timestamp(1525924438, 133) } }, tailable: true, oplogReplay: true, awaitData: true, maxTimeMS: 60000, batchSize: 13981010, term: 9 } query metadata: { $replData: 1, $oplogQueryData: 1, $readPreference: { mode: "secondaryPreferred" } } active: 1 findNetworkTimeout: 65000ms getMoreNetworkTimeout: 10000ms shutting down?: 0 first: 1 firstCommandScheduler: RemoteCommandRetryScheduler request: RemoteCommand 1733 -- target:47.98.136.138:20005 db:local cmd:{ find: "oplog.rs", filter: { ts: { $gte: Timestamp(1525924438, 133) } }, tailable: true, oplogReplay: true, awaitData: true, maxTimeMS: 60000, batchSize: 13981010, term: 9 } active: 1 callbackHandle.valid: 1 callbackHandle.cancelled: 0 attempt: 1 retryPolicy: RetryPolicyImpl maxAttempts: 1 maxTimeMillis: -1ms
2018-05-10T16:10:48.888+0800 I ASIO     [NetworkInterfaceASIO-RS-0] Connecting to 47.98.136.138:20005
2018-05-10T16:10:49.030+0800 I ASIO     [NetworkInterfaceASIO-RS-0] Successfully connected to 47.98.136.138:20005, took 142ms (2 connections now open to 47.98.136.138:20005)
2018-05-10T16:11:53.055+0800 I ASIO     [NetworkInterfaceASIO-RS-0] Ending connection to host 47.98.136.138:20005 due to bad connection status; 1 connections to that host remain open
2018-05-10T16:11:53.056+0800 I REPL     [replication-0] Restarting oplog query due to error: NetworkInterfaceExceededTimeLimit: error in fetcher batch callback: Operation timed out. Last fetched optime (with hash): { ts: Timestamp(1525924438, 133), t: 4 }[4350505511029906934]. Restarts remaining: 1
2018-05-10T16:11:53.057+0800 I REPL     [replication-0] Scheduled new oplog query Fetcher source: 47.98.136.138:20005 database: local query: { find: "oplog.rs", filter: { ts: { $gte: Timestamp(1525924438, 133) } }, tailable: true, oplogReplay: true, awaitData: true, maxTimeMS: 60000, batchSize: 13981010, term: 9 } query metadata: { $replData: 1, $oplogQueryData: 1, $readPreference: { mode: "secondaryPreferred" } } active: 1 findNetworkTimeout: 65000ms getMoreNetworkTimeout: 10000ms shutting down?: 0 first: 1 firstCommandScheduler: RemoteCommandRetryScheduler request: RemoteCommand 1893 -- target:47.98.136.138:20005 db:local cmd:{ find: "oplog.rs", filter: { ts: { $gte: Timestamp(1525924438, 133) } }, tailable: true, oplogReplay: true, awaitData: true, maxTimeMS: 60000, batchSize: 13981010, term: 9 } active: 1 callbackHandle.valid: 1 callbackHandle.cancelled: 0 attempt: 1 retryPolicy: RetryPolicyImpl maxAttempts: 1 maxTimeMillis: -1ms
2018-05-10T16:11:54.396+0800 I ASIO     [NetworkInterfaceASIO-RS-0] Connecting to 47.98.136.138:20005
2018-05-10T16:11:54.531+0800 I ASIO     [NetworkInterfaceASIO-RS-0] Successfully connected to 47.98.136.138:20005, took 135ms (2 connections now open to 47.98.136.138:20005)
2018-05-10T16:12:58.057+0800 I ASIO     [NetworkInterfaceASIO-RS-0] Ending connection to host 47.98.136.138:20005 due to bad connection status; 1 connections to that host remain open
2018-05-10T16:12:58.057+0800 I REPL     [replication-0] Error returned from oplog query (no more query restarts left): NetworkInterfaceExceededTimeLimit: error in fetcher batch callback: Operation timed out
2018-05-10T16:12:58.057+0800 W REPL     [rsBackgroundSync] Fetcher stopped querying remote oplog with error: NetworkInterfaceExceededTimeLimit: error in fetcher batch callback: Operation timed out
2018-05-10T16:12:58.057+0800 I REPL     [rsBackgroundSync] sync source candidate: 47.98.136.138:20005
2018-05-10T16:12:58.110+0800 I REPL     [rsBackgroundSync] Chose same sync source candidate as last time, 47.98.136.138:20005. Sleeping for 1 second to avoid immediately choosing a new sync source for the same reason as last time.
2018-05-10T16:12:59.111+0800 I ASIO     [NetworkInterfaceASIO-RS-0] Connecting to 47.98.136.138:20005
2018-05-10T16:12:59.254+0800 I ASIO     [NetworkInterfaceASIO-RS-0] Successfully connected to 47.98.136.138:20005, took 143ms (2 connections now open to 47.98.136.138:20005)
2018-05-10T16:14:04.111+0800 I ASIO     [NetworkInterfaceASIO-RS-0] Ending connection to host 47.98.136.138:20005 due to bad connection status; 1 connections to that host remain open
2018-05-10T16:14:04.111+0800 I REPL     [replication-0] Restarting oplog query due to error: NetworkInterfaceExceededTimeLimit: error in fetcher batch callback: Operation timed out. Last fetched optime (with hash): { ts: Timestamp(1525924438, 133), t: 4 }[4350505511029906934]. Restarts remaining: 3
2018-05-10T16:14:04.112+0800 I REPL     [replication-0] Scheduled new oplog query Fetcher source: 47.98.136.138:20005 database: local query: { find: "oplog.rs", filter: { ts: { $gte: Timestamp(1525924438, 133) } }, tailable: true, oplogReplay: true, awaitData: true, maxTimeMS: 60000, batchSize: 13981010, term: 9 } query metadata: { $replData: 1, $oplogQueryData: 1, $readPreference: { mode: "secondaryPreferred" } } active: 1 findNetworkTimeout: 65000ms getMoreNetworkTimeout: 10000ms shutting down?: 0 first: 1 firstCommandScheduler: RemoteCommandRetryScheduler request: RemoteCommand 2217 -- target:47.98.136.138:20005 db:local cmd:{ find: "oplog.rs", filter: { ts: { $gte: Timestamp(1525924438, 133) } }, tailable: true, oplogReplay: true, awaitData: true, maxTimeMS: 60000, batchSize: 13981010, term: 9 } active: 1 callbackHandle.valid: 1 callbackHandle.cancelled: 0 attempt: 1 retryPolicy: RetryPolicyImpl maxAttempts: 1 maxTimeMillis: -1ms
2018-05-10T16:14:04.466+0800 I ASIO     [NetworkInterfaceASIO-RS-0] Connecting to 47.98.136.138:20005
2018-05-10T16:14:04.599+0800 I ASIO     [NetworkInterfaceASIO-RS-0] Successfully connected to 47.98.136.138:20005, took 133ms (2 connections now open to 47.98.136.138:20005)
2018-05-10T16:14:53.533+0800 I NETWORK  [thread7] Starting new replica set monitor for wcgroup/39.108.56.116:20005,47.105.36.1:20005,47.98.136.138:20005
2018-05-10T16:14:53.844+0800 I NETWORK  [thread7] Successfully connected to 47.98.136.138:20005 (9 connections now open to 47.98.136.138:20005 with a 0 second timeout)
2018-05-10T16:14:53.928+0800 I NETWORK  [thread7] scoped connection to 47.98.136.138:20005 not being returned to the pool
2018-05-10T16:14:53.928+0800 I NETWORK  [thread7] Starting new replica set monitor for wcgroup/39.108.56.116:20005,47.105.36.1:20005,47.98.136.138:20005
2018-05-10T16:14:54.017+0800 I NETWORK  [thread7] Successfully connected to 47.98.136.138:20005 (10 connections now open to 47.98.136.138:20005 with a 0 second timeout)
2018-05-10T16:14:54.106+0800 I NETWORK  [thread7] scoped connection to 47.98.136.138:20005 not being returned to the pool
2018-05-10T16:14:54.106+0800 I NETWORK  [thread7] Starting new replica set monitor for wcgroup/39.108.56.116:20005,47.105.36.1:20005,47.98.136.138:20005
2018-05-10T16:14:54.207+0800 I NETWORK  [thread7] Successfully connected to 47.98.136.138:20005 (11 connections now open to 47.98.136.138:20005 with a 0 second timeout)
2018-05-10T16:14:54.286+0800 I NETWORK  [thread7] scoped connection to 47.98.136.138:20005 not being returned to the pool
2018-05-10T16:14:54.286+0800 I NETWORK  [thread7] Starting new replica set monitor for wcgroup/39.108.56.116:20005,47.105.36.1:20005,47.98.136.138:20005
2018-05-10T16:14:54.390+0800 I NETWORK  [thread7] Successfully connected to 47.98.136.138:20005 (12 connections now open to 47.98.136.138:20005 with a 0 second timeout)
2018-05-10T16:14:54.473+0800 I NETWORK  [thread7] scoped connection to 47.98.136.138:20005 not being returned to the pool
2018-05-10T16:15:09.112+0800 I ASIO     [NetworkInterfaceASIO-RS-0] Ending connection to host 47.98.136.138:20005 due to bad connection status; 1 connections to that host remain open
2018-05-10T16:15:09.112+0800 I REPL     [replication-0] Restarting oplog query due to error: NetworkInterfaceExceededTimeLimit: error in fetcher batch callback: Operation timed out. Last fetched optime (with hash): { ts: Timestamp(1525924438, 133), t: 4 }[4350505511029906934]. Restarts remaining: 2
2018-05-10T16:15:09.112+0800 I REPL     [replication-0] Scheduled new oplog query Fetcher source: 47.98.136.138:20005 database: local query: { find: "oplog.rs", filter: { ts: { $gte: Timestamp(1525924438, 133) } }, tailable: true, oplogReplay: true, awaitData: true, maxTimeMS: 60000, batchSize: 13981010, term: 9 } query metadata: { $replData: 1, $oplogQueryData: 1, $readPreference: { mode: "secondaryPreferred" } } active: 1 findNetworkTimeout: 65000ms getMoreNetworkTimeout: 10000ms shutting down?: 0 first: 1 firstCommandScheduler: RemoteCommandRetryScheduler request: RemoteCommand 2387 -- target:47.98.136.138:20005 db:local cmd:{ find: "oplog.rs", filter: { ts: { $gte: Timestamp(1525924438, 133) } }, tailable: true, oplogReplay: true, awaitData: true, maxTimeMS: 60000, batchSize: 13981010, term: 9 } active: 1 callbackHandle.valid: 1 callbackHandle.cancelled: 0 attempt: 1 retryPolicy: RetryPolicyImpl maxAttempts: 1 maxTimeMillis: -1ms
2018-05-10T16:15:09.944+0800 I ASIO     [NetworkInterfaceASIO-RS-0] Connecting to 47.98.136.138:20005
2018-05-10T16:15:10.324+0800 I ASIO     [NetworkInterfaceASIO-RS-0] Successfully connected to 47.98.136.138:20005, took 380ms (2 connections now open to 47.98.136.138:20005)
2018-05-10T16:16:14.112+0800 I ASIO     [NetworkInterfaceASIO-RS-0] Ending connection to host 47.98.136.138:20005 due to bad connection status; 1 connections to that host remain open
2018-05-10T16:16:14.112+0800 I REPL     [replication-0] Restarting oplog query due to error: NetworkInterfaceExceededTimeLimit: error in fetcher batch callback: Operation timed out. Last fetched optime (with hash): { ts: Timestamp(1525924438, 133), t: 4 }[4350505511029906934]. Restarts remaining: 1
2018-05-10T16:16:14.112+0800 I REPL     [replication-0] Scheduled new oplog query Fetcher source: 47.98.136.138:20005 database: local query: { find: "oplog.rs", filter: { ts: { $gte: Timestamp(1525924438, 133) } }, tailable: true, oplogReplay: true, awaitData: true, maxTimeMS: 60000, batchSize: 13981010, term: 9 } query metadata: { $replData: 1, $oplogQueryData: 1, $readPreference: { mode: "secondaryPreferred" } } active: 1 findNetworkTimeout: 65000ms getMoreNetworkTimeout: 10000ms shutting down?: 0 first: 1 firstCommandScheduler: RemoteCommandRetryScheduler request: RemoteCommand 2548 -- target:47.98.136.138:20005 db:local cmd:{ find: "oplog.rs", filter: { ts: { $gte: Timestamp(1525924438, 133) } }, tailable: true, oplogReplay: true, awaitData: true, maxTimeMS: 60000, batchSize: 13981010, term: 9 } active: 1 callbackHandle.valid: 1 callbackHandle.cancelled: 0 attempt: 1 retryPolicy: RetryPolicyImpl maxAttempts: 1 maxTimeMillis: -1ms
2018-05-10T16:16:15.708+0800 I ASIO     [NetworkInterfaceASIO-RS-0] Connecting to 47.98.136.138:20005
2018-05-10T16:16:15.844+0800 I ASIO     [NetworkInterfaceASIO-RS-0] Successfully connected to 47.98.136.138:20005, took 136ms (2 connections now open to 47.98.136.138:20005)
2018-05-10T16:17:19.112+0800 I ASIO     [NetworkInterfaceASIO-RS-0] Ending connection to host 47.98.136.138:20005 due to bad connection status; 1 connections to that host remain open
2018-05-10T16:17:19.113+0800 I REPL     [replication-0] Error returned from oplog query (no more query restarts left): NetworkInterfaceExceededTimeLimit: error in fetcher batch callback: Operation timed out
2018-05-10T16:17:19.113+0800 W REPL     [rsBackgroundSync] Fetcher stopped querying remote oplog with error: NetworkInterfaceExceededTimeLimit: error in fetcher batch callback: Operation timed out
2018-05-10T16:17:19.113+0800 I REPL     [rsBackgroundSync] sync source candidate: 47.98.136.138:20005
2018-05-10T16:17:19.167+0800 I REPL     [rsBackgroundSync] Chose same sync source candidate as last time, 47.98.136.138:20005. Sleeping for 1 second to avoid immediately choosing a new sync source for the same reason as last time.
2018-05-10T16:17:20.167+0800 I ASIO     [NetworkInterfaceASIO-RS-0] Connecting to 47.98.136.138:20005
2018-05-10T16:17:20.313+0800 I ASIO     [NetworkInterfaceASIO-RS-0] Successfully connected to 47.98.136.138:20005, took 146ms (2 connections now open to 47.98.136.138:20005)
4 回复

有人遇到过这种问题没?

来人关注下呀

来自酷炫的 CNodeMD

@lovegnep 嗨,哥们最终你的这个问题是如何解决的呢?劳烦赐教,我们现在也碰到跟你一样的问题。 我们参考这个链接 所讲这个应该是v3.6.0的bug,在v3.6.4,v3.7.1版本分别已修复。 但我们目前线上版本是:MongoDB shell version v3.6.6 MongoDB server version: 3.6.4,依旧会复现该问题。

@luochenxi 忘了有没有解决了。

回到顶部