Eric Lee / smarc-fsl-linux-kernel

19 Oct, 2018

2 commits

5ebaf80bc md-cluster: introduce resync_info_get interface for sanity check

The resync region recorded in suspend_info means that one node
is reshaping that area, so the position of reshape_progress
should fall within it.
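
A minimal sketch of the kind of check described above, assuming inclusive bounds. The helper name `reshape_pos_in_region` and the `sector_t` typedef are illustrative stand-ins and are not taken from the patch itself.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t sector_t; /* stand-in for the kernel's sector type */

/* Hypothetical helper: true when the reshape position lies inside the
 * region [lo, hi] that another node reported as being reshaped.
 * Inclusive bounds are an assumption of this sketch. */
bool reshape_pos_in_region(sector_t reshape_progress, sector_t lo, sector_t hi)
{
    return reshape_progress >= lo && reshape_progress <= hi;
}
```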

Reviewed-by: NeilBrown
Signed-off-by: Guoqing Jiang
Signed-off-by: Shaohua Li

Guoqing Jiang
2018-10-19 00:36:35 +0800
afd756286 md-cluster/raid10: resize all the bitmaps before start reshape

To support adding a disk in grow mode, we need to resize
the bitmaps of every node before the reshape, so that
all nodes have the same view of the bitmap of the
clustered raid.

After the master node has resized its bitmap, it broadcasts
a message to the other slave nodes and checks whether all
bitmaps have the same size by comparing pages. The reshape
can only continue after all nodes have updated their bitmaps
to the same size (checked by pages); otherwise the bitmap
size is reverted to its previous value.

The resize_bitmaps interface and BITMAP_RESIZE message are
introduced in md-cluster.c for this purpose.
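
The compare-then-revert idea can be sketched as follows. This is only an illustration: `struct node_bitmap` and `resize_all_or_revert` are hypothetical names, index 0 is assumed to be the master, and the real code compares on-disk bitmaps through DLM rather than an in-memory array.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-node view: number of bitmap pages after the resize. */
struct node_bitmap {
    unsigned long pages;
};

/* Returns true when every node reports the same page count as the
 * master (index 0). On a mismatch, every node is put back to old_pages
 * and false is returned, so the reshape must not continue. */
bool resize_all_or_revert(struct node_bitmap *nodes, size_t n,
                          unsigned long old_pages)
{
    for (size_t i = 1; i < n; i++) {
        if (nodes[i].pages != nodes[0].pages) {
            for (size_t j = 0; j < n; j++)
                nodes[j].pages = old_pages;
            return false;
        }
    }
    return true;
}
```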

Reviewed-by: NeilBrown
Signed-off-by: Guoqing Jiang
Signed-off-by: Shaohua Li

Guoqing Jiang
2018-10-19 00:30:58 +0800

02 Nov, 2017

1 commit

b24413180 License cleanup: add SPDX GPL-2.0 license identifier to files with no license

Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.

By default all files without license information are under the default
license of the kernel, which is GPL version 2.

Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.

This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.

How this work was done:

Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information in it,
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information.

Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.

The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.

The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.

Criteria used to select files for SPDX license identifier tagging were:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if
Reviewed-by: Philippe Ombredanne
Reviewed-by: Thomas Gleixner
Signed-off-by: Greg Kroah-Hartman

Greg Kroah-Hartman
2017-11-02 18:10:55 +0800

17 Mar, 2017

1 commit

818da59f9 md-cluster: add the support for resize

To update the size of a cluster raid, we need to make
sure all nodes can perform the change successfully.
However, some of them may be unable to do so because
of a failure (bitmap_resize could fail). So we need to
handle this case before setting the capacity
unconditionally, and we use the steps below to perform
a sanity check.

1. A changes the size, then broadcasts a METADATA_UPDATED
msg.
2. B and C receive METADATA_UPDATED and change the size,
except that they do not call set_capacity; sync_size is
not updated if the change failed. They also call
bitmap_update_sb to sync the sb to disk.
3. A checks the other nodes' sync_size; if sync_size has
been updated on all nodes, it sends a CHANGE_CAPACITY
msg, otherwise it sends a msg to revert the previous change.
4. B and C call set_capacity if they receive the
CHANGE_CAPACITY msg; otherwise pers->resize is called to
restore the old value.
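
Step 3 of the procedure above, the decision made by node A, might look roughly like this. The enum values and `decide_resize_msg` are hypothetical names for illustration; only the CHANGE_CAPACITY message itself is named in the commit.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t sector_t; /* stand-in for the kernel's sector type */

/* Hypothetical message kinds for this sketch. */
enum resize_msg { MSG_CHANGE_CAPACITY, MSG_REVERT };

/* Node A inspects the sync_size recorded by every other node. Only if
 * all of them reached new_size does it announce the capacity change;
 * any straggler causes a revert instead. */
enum resize_msg decide_resize_msg(const sector_t *sync_size, size_t n,
                                  sector_t new_size)
{
    for (size_t i = 0; i < n; i++) {
        if (sync_size[i] != new_size)
            return MSG_REVERT;
    }
    return MSG_CHANGE_CAPACITY;
}
```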

Reviewed-by: NeilBrown
Signed-off-by: Guoqing Jiang
Signed-off-by: Shaohua Li

Guoqing Jiang
2017-03-17 07:55:50 +0800

10 May, 2016

1 commit

51e453aec md-cluster: gather resync infos and enable recv_thread after bitmap is ready

The in-memory bitmap is not ready when a node joins the cluster,
so it doesn't make sense to call gather_all_resync_info()
that early; we need to call it after the node's
bitmap is set up. Also, recv_thread could be woken up after
the node joins the cluster, but that could cause problems if the
node receives a RESYNCING message without a personality, since
mddev->pers->quiesce is called in process_suspend_info.

This commit introduces a new cluster interface, load_bitmaps,
to fix the above problems. load_bitmaps is called in bitmap_load,
where the bitmap and personality are ready, and it
does the following tasks:

1. call gather_all_resync_info to load all the nodes'
bitmap info.
2. set the MD_CLUSTER_ALREADY_IN_CLUSTER bit so that recv_thread
can be woken up, and wake up recv_thread if there is a
pending recv event.

Then ack_bast only wakes up recv_thread after the IN_CLUSTER
bit is set; otherwise MD_CLUSTER_PENDING_RESYNC_EVENT is
set.
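
The deferral logic can be sketched with two flag bits and a wakeup counter. All names here (`FLAG_IN_CLUSTER`, `FLAG_PENDING_EVENT`, `struct cinfo_sketch`, and the `sketch_` functions) are illustrative, and the counter stands in for actually waking a kernel thread.

```c
/* Hypothetical flag bits modelled on the two bits named in the commit. */
enum {
    FLAG_IN_CLUSTER    = 1u << 0,
    FLAG_PENDING_EVENT = 1u << 1
};

struct cinfo_sketch {
    unsigned int state;
    int wakeups; /* counts how often recv_thread would be woken */
};

/* Lock callback: wake the receiver only once the node is fully set up,
 * otherwise remember that an event is waiting. */
void sketch_ack_bast(struct cinfo_sketch *c)
{
    if (c->state & FLAG_IN_CLUSTER)
        c->wakeups++;
    else
        c->state |= FLAG_PENDING_EVENT;
}

/* Called once bitmap and personality are ready. Gathering the other
 * nodes' resync info would happen first and is omitted here. */
void sketch_load_bitmaps(struct cinfo_sketch *c)
{
    c->state |= FLAG_IN_CLUSTER;
    if (c->state & FLAG_PENDING_EVENT) {
        c->state &= ~FLAG_PENDING_EVENT;
        c->wakeups++;
    }
}
```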

Reviewed-by: NeilBrown
Signed-off-by: Guoqing Jiang
Signed-off-by: Shaohua Li

Guoqing Jiang
2016-05-10 00:24:03 +0800

06 Jan, 2016

1 commit

f6a2dc64e md-cluster: append some actions when change bitmap from clustered to none

For clustered raid, we need to do extra actions when changing the
bitmap to none.

1. check whether all the bitmap locks can be acquired; if so,
we can continue the change since the cluster raid is only active
on the current node. Otherwise return failure and unlock the
related bitmap locks.
2. set nodes to 0 and then leave the cluster environment.
3. release the other nodes' bitmap locks.
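
Step 1 is an all-or-nothing lock acquisition. A minimal sketch, assuming each lock is modelled as a boolean: `busy[i]` means another node holds lock `i`, and `mine[i]` records what this node has taken. The function name is hypothetical and no real DLM calls are made.

```c
#include <stdbool.h>
#include <stddef.h>

/* Try to take every bitmap lock. If any lock is held elsewhere, drop
 * the locks taken so far and report failure, which means the array is
 * still active on another node. */
bool try_lock_all_bitmaps(const bool *busy, bool *mine, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (busy[i]) {
            while (i-- > 0)
                mine[i] = false; /* roll back earlier acquisitions */
            return false;
        }
        mine[i] = true;
    }
    return true;
}
```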

Signed-off-by: Guoqing Jiang
Signed-off-by: NeilBrown

Guoqing Jiang
2016-01-06 08:38:57 +0800

12 Oct, 2015

3 commits

dbb64f863 md-cluster: Fix adding of new disk with new reload code

Adding the disk worked incorrectly with the new reload code. Fix it:

- No operation should be performed on rdev marked as Candidate
- After a metadata update operation, kick disk if role is 0xfffe
else clear Candidate bit and continue with the regular change check.
- Saving the mode of the lock resource to check if token lock is already
locked, because it can be called twice while adding a disk. However,
unlock_comm() must be called only once.
- add_new_disk() is called by the node initiating the --add operation.
If it needs to be canceled, call add_new_disk_cancel(). The operation
is completed by md_update_sb() which will write and unlock the
communication.

Signed-off-by: Goldwyn Rodrigues

Goldwyn Rodrigues
2015-10-12 16:35:30 +0800
c186b128c md-cluster: Perform resync/recovery under a DLM lock

Resync or recovery must be performed by only one node at a time.
A DLM lock resource, resync_lockres provides the mutual exclusion
so that only one node performs the recovery/resync at a time.

If a node is unable to get the resync_lockres because recovery is
being performed by another node, it sets MD_RECOVER_NEEDED so as
to schedule recovery in the future.
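
The try-lock-or-defer behaviour can be sketched like this. `struct resync_lock_sketch` is a toy stand-in for the DLM lock resource (an owner id, with -1 meaning free), and the `recover_needed` flag stands in for setting MD_RECOVER_NEEDED; none of these names come from the kernel.

```c
#include <stdbool.h>

/* Toy model of the cluster-wide resync lock: owner is a node id,
 * or -1 when nobody holds it. */
struct resync_lock_sketch {
    int owner;
};

/* Returns true if this node may resync now. When another node holds
 * the lock, *recover_needed is set so recovery is retried later. */
bool resync_try_start(struct resync_lock_sketch *l, int node,
                      bool *recover_needed)
{
    if (l->owner != -1 && l->owner != node) {
        *recover_needed = true;
        return false;
    }
    l->owner = node;
    *recover_needed = false;
    return true;
}

/* Release the lock if this node holds it. */
void resync_finish(struct resync_lock_sketch *l, int node)
{
    if (l->owner == node)
        l->owner = -1;
}
```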

Remove the debug message in resync_info_update()
used during development.

Signed-off-by: Goldwyn Rodrigues

Goldwyn Rodrigues
2015-10-12 16:32:44 +0800
c40f341f1 md-cluster: Use a small window for resync

Suspending the entire device for resync could take too long. Resync
in small chunks.

The cluster's resync window (32M) is maintained in r1conf as
cluster_sync_low and cluster_sync_high and processed in
raid1's sync_request(). If the current resync is outside the cluster
resync window:

1. Set the cluster_sync_low to curr_resync_completed.
2. Check if the sync will fit in the new window, if not issue a
wait_barrier() and set cluster_sync_low to sector_nr.
3. Set cluster_sync_high to cluster_sync_low + resync_window.
4. Send a message to all nodes so they may add it in their suspension
list.
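
Steps 1 to 3 can be sketched as a small function. This assumes "32M" means 32 MiB expressed in 512-byte sectors, uses a hypothetical `struct sync_window` in place of the r1conf fields, and returns true where the real code would send the message of step 4; the wait_barrier() call is only marked by a comment.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t sector_t; /* stand-in for the kernel's sector type */

/* Assumption: 32 MiB window, counted in 512-byte sectors (65536). */
#define WINDOW_SECTORS ((sector_t)32 * 1024 * 1024 / 512)

struct sync_window {
    sector_t low, high; /* cluster_sync_low / cluster_sync_high */
};

/* Moves the window when [sector_nr, sector_nr + nr_sectors) falls
 * outside it. Returns true if the window moved, i.e. the other nodes
 * would have to be told about the new range. */
bool advance_window(struct sync_window *w, sector_t curr_resync_completed,
                    sector_t sector_nr, sector_t nr_sectors)
{
    if (sector_nr >= w->low && sector_nr + nr_sectors <= w->high)
        return false;

    w->low = curr_resync_completed;
    if (sector_nr + nr_sectors > w->low + WINDOW_SECTORS)
        w->low = sector_nr; /* real code would wait_barrier() first */
    w->high = w->low + WINDOW_SECTORS;
    return true;
}
```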

bitmap_cond_end_sync is modified to allow forcing a sync in order
to bring curr_resync_completed up to date with the sector passed.

Signed-off-by: Goldwyn Rodrigues
Signed-off-by: NeilBrown

Goldwyn Rodrigues
2015-10-12 14:32:05 +0800

24 Jul, 2015

1 commit

90382ed9a Fix read-balancing during node failure

During a node failure, we need to suspend read balancing so that
reads are directed to the first device and stale data is not read.
Suspending writes is not required because these would be recorded and
synced eventually.

A new flag MD_CLUSTER_SUSPEND_READ_BALANCING is set in recover_prep().
area_resyncing() will respond true for the entire device if this
flag is set and the request type is READ. The flag is cleared
in recover_done().
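
A rough sketch of how the flag changes the answer for reads. The struct, enum and `sketch_` function names are hypothetical, and the per-range checks that the real area_resyncing() also performs are left out.

```c
#include <stdbool.h>

enum io_dir { IO_READ, IO_WRITE };

/* Hypothetical holder for the single flag discussed above. */
struct cluster_flags {
    bool suspend_read_balancing;
};

void sketch_recover_prep(struct cluster_flags *f)
{
    f->suspend_read_balancing = true;  /* a node went down */
}

void sketch_recover_done(struct cluster_flags *f)
{
    f->suspend_read_balancing = false; /* recovery finished */
}

/* While the flag is set, every read is treated as hitting a resyncing
 * area, regardless of its sector range. Writes are unaffected by the
 * flag; range-based checks are omitted from this sketch. */
bool sketch_area_resyncing(const struct cluster_flags *f, enum io_dir dir)
{
    if (dir == IO_READ && f->suspend_read_balancing)
        return true;
    return false;
}
```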

Signed-off-by: Goldwyn Rodrigues
Reported-By: David Teigland
Signed-off-by: NeilBrown

Goldwyn Rodrigues
2015-07-24 11:37:59 +0800

22 Apr, 2015

2 commits

97f6cd39d md-cluster: re-add capabilities

When "re-add" is written to /sys/block/mdXX/md/dev-YYY/state,
the clustered md:

1. Sends RE_ADD message with the desc_nr. Nodes receiving the message
clear the Faulty bit in their respective rdev->flags.
2. The node initiating re-add, gathers the bitmaps of all nodes
and copies them into the local bitmap. It does not clear the bitmap
from which it is copying.
3. Initiating node schedules a md recovery to sync the devices.

Signed-off-by: Guoqing Jiang
Signed-off-by: Goldwyn Rodrigues
Signed-off-by: NeilBrown

Goldwyn Rodrigues
2015-04-22 05:59:39 +0800
88bcfef7b md-cluster: remove capabilities

This adds "remove" capabilities for the clustered environment.
When a user initiates removal of a device from the array, a
REMOVE message with disk number in the array is sent to all
the nodes which kick the respective device in their own array.

This facilitates the removal of failed devices.

Signed-off-by: Goldwyn Rodrigues
Signed-off-by: NeilBrown

Goldwyn Rodrigues
2015-04-22 05:59:39 +0800

21 Mar, 2015

1 commit

fa8259da0 md: Fix stray --cluster-confirm crash

A --cluster-confirm without an --add (by another node) can
crash the kernel.

Fix it by guarding it using a state.

Signed-off-by: Goldwyn Rodrigues
Signed-off-by: NeilBrown

Goldwyn Rodrigues
2015-03-21 07:33:00 +0800

23 Feb, 2015

7 commits

1aee41f63 Add new disk to clustered array

Algorithm:
1. Node 1 issues mdadm --manage /dev/mdX --add /dev/sdYY which issues
ioctl(ADD_NEW_DISK with disc.state set to MD_DISK_CLUSTER_ADD)
2. Node 1 sends NEWDISK with uuid and slot number
3. Other nodes issue kobject_uevent_env with uuid and slot number
(Steps 4,5 could be a udev rule)
4. In userspace, the node searches for the disk, perhaps
using blkid -t SUB_UUID=""
5. Other nodes issue either of the following depending on whether the disk
was found:
ioctl(ADD_NEW_DISK with disc.state set to MD_DISK_CANDIDATE and
disc.number set to slot number)
ioctl(CLUSTERED_DISK_NACK)
6. Other nodes drop lock on no-new-devs (CR) if device is found
7. Node 1 attempts EX lock on no-new-devs
8. If node 1 gets the lock, it sends METADATA_UPDATED after unmarking the disk
as SpareLocal
9. If not (get no-new-dev lock), it fails the operation and sends METADATA_UPDATED
10. Other nodes understand if the device is added or not by reading the superblock again after receiving the METADATA_UPDATED message.

Signed-off-by: Lidong Zhong
Signed-off-by: Goldwyn Rodrigues

Goldwyn Rodrigues
2015-02-23 23:59:07 +0800
589a1c491 Suspend writes in RAID1 if within range

If there is a resync going on, all nodes must suspend writes to the
range. This is recorded in the suspend_info/suspend_list.

If there is an I/O within the ranges of any of the suspend_info,
should_suspend will return 1.
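
The range test can be sketched as a walk over a singly linked list. `struct suspend_range` and `should_suspend_sketch` are illustrative names, and treating the ranges as half-open is an assumption of this sketch.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t sector_t; /* stand-in for the kernel's sector type */

/* One suspended range, as announced by a resyncing node. */
struct suspend_range {
    sector_t lo, hi;
    struct suspend_range *next;
};

/* Returns 1 if the I/O range [lo, hi) overlaps any suspended range,
 * 0 otherwise. */
int should_suspend_sketch(const struct suspend_range *list,
                          sector_t lo, sector_t hi)
{
    for (; list != NULL; list = list->next) {
        if (hi > list->lo && lo < list->hi)
            return 1;
    }
    return 0;
}
```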

Signed-off-by: Goldwyn Rodrigues

Goldwyn Rodrigues
2015-02-23 23:59:07 +0800
965400eb6 Send RESYNCING while performing resync start/stop

When a resync is initiated, RESYNCING message is sent to all active
nodes with the range (lo,hi). When the resync is over, a RESYNCING
message is sent with (0,0). A high sector value of zero indicates
that the resync is over.
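
The start/stop convention can be illustrated with a tiny message type. The struct layout and helper names are hypothetical; only the (lo,hi) payload and the meaning of a zero high sector come from the description above.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t sector_t; /* stand-in for the kernel's sector type */

/* Hypothetical payload of a RESYNCING message. */
struct resyncing_msg {
    sector_t lo, hi;
};

/* Message announcing that [lo, hi] is being resynced. */
struct resyncing_msg resync_started(sector_t lo, sector_t hi)
{
    struct resyncing_msg m = { lo, hi };
    return m;
}

/* Message announcing the end of the resync: (0,0). */
struct resyncing_msg resync_stopped(void)
{
    struct resyncing_msg m = { 0, 0 };
    return m;
}

/* A receiver only needs to look at the high sector. */
bool resync_is_over(const struct resyncing_msg *m)
{
    return m->hi == 0;
}
```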

Signed-off-by: Goldwyn Rodrigues

Goldwyn Rodrigues
2015-02-23 23:59:06 +0800
293467aa1 metadata_update sends message to other nodes

- request to send a message
- make changes to superblock
- send messages telling everyone that the superblock has changed
- other nodes all read the superblock
- other nodes all ack the messages
- updating node releases the "I'm sending a message" resource.

Signed-off-by: Goldwyn Rodrigues

Goldwyn Rodrigues
2015-02-23 23:59:06 +0800
96ae923ab Gather on-going resync information of other nodes

When a node joins, it does not know of other nodes performing resync.
So, each node keeps the resync information in its LVB. When a new
node joins, it reads the LVB of each "online" bitmap.

[TODO] The new node attempts to get the PW lock on the other bitmap; if
it is successful, it reads the bitmap and performs the resync (if
required) on its behalf.

If the node does not get the PW, it requests CR and reads the LVB
for the resync information.

Signed-off-by: Goldwyn Rodrigues

Goldwyn Rodrigues
2015-02-23 21:30:11 +0800
cf921cc19 Add node recovery callbacks

DLM offers callbacks when a node fails and the lock remastery
is performed:

1. recover_prep: called when DLM discovers a node is down
2. recover_slot: called when DLM identifies the node and recovery
can start
3. recover_done: called when all nodes have completed recover_slot

recover_slot() and recover_done() are also called when the node joins
initially in order to inform the node with its slot number. These slot
numbers start from one, so we deduct one to make it start with zero
which the cluster-md code uses.

Signed-off-by: Goldwyn Rodrigues

Goldwyn Rodrigues
2015-02-23 21:30:11 +0800
edb39c9de Introduce md_cluster_operations to handle cluster functions

This allows dynamic registering of cluster hooks.
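
Dynamic registration of hooks is commonly done with a table of function pointers. The sketch below shows that general pattern only; `struct cluster_ops_sketch`, its single `join` hook, the register/unregister helpers and the demo backend are all hypothetical and do not reproduce the actual md_cluster_operations layout.

```c
#include <stddef.h>

/* Hypothetical hook table with one operation. */
struct cluster_ops_sketch {
    int (*join)(int nodes);
};

/* Currently registered backend, or NULL when none is loaded. */
static const struct cluster_ops_sketch *registered_ops;

/* Returns 0 on success, -1 if a backend is already registered. */
int register_cluster_ops(const struct cluster_ops_sketch *ops)
{
    if (registered_ops != NULL)
        return -1;
    registered_ops = ops;
    return 0;
}

void unregister_cluster_ops(void)
{
    registered_ops = NULL;
}

/* Core code calls through the table and fails cleanly without one. */
int cluster_join(int nodes)
{
    if (registered_ops == NULL)
        return -1;
    return registered_ops->join(nodes);
}

/* Demo backend: pretends to join and reports the node count back. */
static int demo_join(int nodes)
{
    return nodes;
}

const struct cluster_ops_sketch demo_ops = { demo_join };
```

The core never names the backend directly, so the cluster code can be loaded or removed at run time without relinking the caller.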

Signed-off-by: Goldwyn Rodrigues

Goldwyn Rodrigues
2015-02-23 21:28:42 +0800