linux-edison.git
12 years agopinctrl: fix mutex deadlock in get_pinctrl_dev_from_of_node()
Daniel Mack [Fri, 26 Apr 2013 16:57:02 +0000 (18:57 +0200)]
pinctrl: fix mutex deadlock in get_pinctrl_dev_from_of_node()

This obvious bug was introduced by d755910b7 ("pinctrl: move subsystem
mutex to pinctrl_dev struct").

Signed-off-by: Daniel Mack <zonque@gmail.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
12 years agopinctrl: plgpio: add CONFIG_PM_SLEEP to suspend/resume functions
Jingoo Han [Mon, 25 Mar 2013 09:59:38 +0000 (18:59 +0900)]
pinctrl: plgpio: add CONFIG_PM_SLEEP to suspend/resume functions

Guard the suspend/resume functions with CONFIG_PM_SLEEP to fix the
following build warnings when CONFIG_PM_SLEEP is not selected. The
sleep PM callbacks defined by SIMPLE_DEV_PM_OPS are only used when
CONFIG_PM_SLEEP is enabled.

drivers/pinctrl/spear/pinctrl-plgpio.c:645:12: warning: 'plgpio_suspend' defined but not used [-Wunused-function]
drivers/pinctrl/spear/pinctrl-plgpio.c:684:12: warning: 'plgpio_resume' defined but not used [-Wunused-function]
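
For illustration only (not part of the patch), a minimal userspace sketch of
the guard pattern. The struct and macro are simplified stand-ins for the
kernel definitions, and demo_suspend/demo_resume are hypothetical names.

```c
/* Userspace mock of the kernel pattern; the struct and macro below are
 * simplified stand-ins, not the definitions from <linux/pm.h>. */
struct dev_pm_ops {
	int (*suspend)(void *dev);
	int (*resume)(void *dev);
};

#ifdef CONFIG_PM_SLEEP
#define SIMPLE_DEV_PM_OPS(name, suspend_fn, resume_fn) \
	const struct dev_pm_ops name = { suspend_fn, resume_fn }
#else
/* The callbacks are dropped here, so unguarded functions would be unused. */
#define SIMPLE_DEV_PM_OPS(name, suspend_fn, resume_fn) \
	const struct dev_pm_ops name = { 0, 0 }
#endif

#ifdef CONFIG_PM_SLEEP
/* Hypothetical callbacks, compiled only when they are referenced. */
static int demo_suspend(void *dev) { (void)dev; return 0; }
static int demo_resume(void *dev) { (void)dev; return 0; }
#endif

SIMPLE_DEV_PM_OPS(demo_pm_ops, demo_suspend, demo_resume);
```

Without the second #ifdef, a build that leaves CONFIG_PM_SLEEP undefined
would compile two static functions that nothing references, which is what
-Wunused-function reports.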

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
12 years agorelay: move remove_buf_file inside relay_close_buf
Dmitry Monakhov [Mon, 22 Apr 2013 07:41:41 +0000 (11:41 +0400)]
relay: move remove_buf_file inside relay_close_buf

Currently the remove_buf_file callback is called from the kobject
release method. This results in the following issue:
# blktrace -d /dev/sda1 -d /dev/sda -o test

blktrace_setup()
 dir = create_dir()
 rchan = relay_open(dir,...)
 ->create_buf_file_callback
    buf_file  = debugfs_create_file(dir, )

Userspace will open buf_file.
Later we decide to stop tracing:
blktrace_down()
  relay_close(rchan)  /* just decrements the kobj reference */
                      /* since it is not zero, the callback is not called */
  debugfs_remove(dir) /* FAILS due to non-empty dir */

Later userspace will close the file and the file will be deleted,
but the directory still exists.
user_space_close()
 ->file_release
   ->release_buf_file_callback
     ->debugfs_remove(buf_file)
## TESTCASE:
# blktrace -d /dev/sda1 -d /dev/sda -o test
# After that the blktrace infrastructure will remain in a broken,
# unusable state, so: blktrace -d /dev/sda1 will not work.

In fact this is a general issue; blktrace is just one example.
We cannot reliably remove the parent dir until all users close the
buf_file.

Solution: We don't have to wait that long. File should be deleted inside
relay_close_buf().

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
12 years agoMerge tag 'v3.9' into efi-for-tip2
Matt Fleming [Tue, 30 Apr 2013 10:30:24 +0000 (11:30 +0100)]
Merge tag 'v3.9' into efi-for-tip2

Resolve conflicts for Ingo.

Conflicts:
drivers/firmware/Kconfig
drivers/firmware/efivars.c

Signed-off-by: Matt Fleming <matt.fleming@intel.com>
12 years agosudmac: add support for SUDMAC
Shimoda, Yoshihiro [Tue, 23 Apr 2013 11:00:12 +0000 (20:00 +0900)]
sudmac: add support for SUDMAC

Some Renesas USB modules have SUDMAC. This patch supports it using
the shdma-base driver.

Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Reviewed-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Acked-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
12 years agodma: sh: add Kconfig
Shimoda, Yoshihiro [Tue, 23 Apr 2013 11:00:06 +0000 (20:00 +0900)]
dma: sh: add Kconfig

This patch adds a Kconfig in drivers/dma/sh. It also adds a new
config "SH_DMAE_BASE", and "config SH_DMAE" depends on it.
Since some drivers (e.g. sh_mmcif.c) depend on shdma-base.c if
CONFIG_DMA_ENGINE=y, "config SH_DMAE_BASE" is set as "bool".

Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Acked-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
12 years agokvm/ppc/mpic: remove default routes from documentation
Scott Wood [Mon, 29 Apr 2013 14:07:48 +0000 (14:07 +0000)]
kvm/ppc/mpic: remove default routes from documentation

The default routes were removed from the code during patchset
respinning, but were not removed from the documentation.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
12 years agoperf/x86/intel: Add support for IvyBridge model 58 Uncore
Vince Weaver [Mon, 29 Apr 2013 19:52:27 +0000 (15:52 -0400)]
perf/x86/intel: Add support for IvyBridge model 58 Uncore

According to Intel Vol3b 18.9, the IvyBridge model 58 uncore is
the same as that of SandyBridge.

I've done some simple tests and with this patch things seem to
work on my mac-mini.

Signed-off-by: Vince Weaver <vincent.weaver@maine.edu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: Stephane Eranian <eranian@gmail.com>
Link: http://lkml.kernel.org/r/alpine.DEB.2.02.1304291549320.15827@vincent-weaver-1.um.maine.edu
Cc: <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
12 years agoperf/x86/intel: Fix typo in perf_event_intel_uncore.c
Vince Weaver [Mon, 29 Apr 2013 19:49:28 +0000 (15:49 -0400)]
perf/x86/intel: Fix typo in perf_event_intel_uncore.c

Sandy Bridge was misspelled.  Either that or the Intel marketing
names are getting even more obscure.

Signed-off-by: Vince Weaver <vincent.weaver@maine.edu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Link: http://lkml.kernel.org/r/alpine.DEB.2.02.1304291546590.15827@vincent-weaver-1.um.maine.edu
[ Haha ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
12 years agox86: Eliminate irq_mis_count counted in arch_irq_stat
Li Fei [Fri, 26 Apr 2013 12:50:11 +0000 (20:50 +0800)]
x86: Eliminate irq_mis_count counted in arch_irq_stat

With the current implementation, kstat_cpu(cpu).irqs_sum is also
increased in case of irq_mis_count increment.

So there is no need to count irq_mis_count in arch_irq_stat,
otherwise irq_mis_count will be counted twice in the sum of
/proc/stat.

Reported-by: Liu Chuansheng <chuansheng.liu@intel.com>
Signed-off-by: Li Fei <fei.li@intel.com>
Acked-by: Liu Chuansheng <chuansheng.liu@intel.com>
Cc: tomoki.sekiyama.qu@hitachi.com
Cc: joe@perches.com
Link: http://lkml.kernel.org/r/1366980611.32469.7.camel@fli24-HP-Compaq-8100-Elite-CMT-PC
Cc: <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
12 years agoMerge branch 'rcu/nohz' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck...
Ingo Molnar [Tue, 30 Apr 2013 08:51:23 +0000 (10:51 +0200)]
Merge branch 'rcu/nohz' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into timers/nohz

Pull full dynticks documentation update from Paul McKenney.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
12 years agodrm/i915: Always normalize return timeout for wait_timeout_ioctl
Chris Wilson [Fri, 26 Apr 2013 13:22:46 +0000 (16:22 +0300)]
drm/i915: Always normalize return timeout for wait_timeout_ioctl

As we recompute the remaining timeout after waiting, there is a
potential for that timeout to be less than zero, so it needs sanitizing.
The timeout is always returned to userspace and validated, so we should
always perform the sanitization.
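
For illustration only (not the driver code), a minimal sketch of the idea.
The helper name is hypothetical, and clamping a negative remainder to zero
is an assumption about what "sanitizing" means here.

```c
#include <time.h>

/* Hypothetical helper: clamp a recomputed remaining timeout so that a
 * negative value is never handed back to the caller. */
struct timespec demo_sanitize_remaining(struct timespec remaining)
{
	if (remaining.tv_sec < 0 || remaining.tv_nsec < 0) {
		remaining.tv_sec = 0;
		remaining.tv_nsec = 0;
	}
	return remaining;
}
```

A valid remainder passes through unchanged, which matches the v2 note that
only an invalid timespec is normalized.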

v2 [vsyrjala]: Only normalize the timespec if it's invalid
v3: Add a comment to clarify the situation and remove the now
    useless WARN_ON() (ickle)

Cc: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Tested-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
12 years agoMerge branch 'rcu/doc' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux...
Ingo Molnar [Tue, 30 Apr 2013 08:49:04 +0000 (10:49 +0200)]
Merge branch 'rcu/doc' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/urgent

Pull RCU documentation update for reducing OS jitter due to
per-CPU kthreads, from Paul McKenney.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
12 years agoat_hdmac: move to generic DMA binding
Ludovic Desroches [Fri, 19 Apr 2013 09:11:18 +0000 (09:11 +0000)]
at_hdmac: move to generic DMA binding

Update the at_hdmac driver to support the generic DMA device tree binding.
Devices can still request a channel with dma_request_channel(), so this
doesn't break DMA for non-DT boards.

Signed-off-by: Ludovic Desroches <ludovic.desroches@atmel.com>
Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Acked-by: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
12 years agoMerge branches 'for-3.10/wiimote' and 'for-3.9/upstream-fixes' into for-linus
Jiri Kosina [Tue, 30 Apr 2013 08:19:21 +0000 (10:19 +0200)]
Merge branches 'for-3.10/wiimote' and 'for-3.9/upstream-fixes' into for-linus

12 years agoMerge branches 'for-3.10/multitouch', 'for-3.10/roccat' and 'for-3.10/upstream' into...
Jiri Kosina [Tue, 30 Apr 2013 08:19:07 +0000 (10:19 +0200)]
Merge branches 'for-3.10/multitouch', 'for-3.10/roccat' and 'for-3.10/upstream' into for-linus

Conflicts:
drivers/hid/Kconfig

12 years agoMerge branch 'for-3.10/mt-hybrid-finger-pen' into for-linus
Jiri Kosina [Tue, 30 Apr 2013 08:17:48 +0000 (10:17 +0200)]
Merge branch 'for-3.10/mt-hybrid-finger-pen' into for-linus

Conflicts:
drivers/hid/hid-multitouch.c

12 years agoMerge branches 'for-3.10/appleir', 'for-3.10/hid-debug', 'for-3.10/hid-driver-transpo...
Jiri Kosina [Tue, 30 Apr 2013 08:12:44 +0000 (10:12 +0200)]
Merge branches 'for-3.10/appleir', 'for-3.10/hid-debug', 'for-3.10/hid-driver-transport-cleanups', 'for-3.10/i2c-hid' and 'for-3.10/logitech' into for-linus

12 years agoHID: protect hid_debug_list
Jiri Kosina [Tue, 16 Apr 2013 22:40:09 +0000 (15:40 -0700)]
HID: protect hid_debug_list

Accesses to hid_device->hid_debug_list are not serialized properly, which
could result in SMP concurrency issues when HID debugfs events are accessed
by multiple userspace processes.

Serialize all the list operations by a mutex.

Spotted by Al Viro.

Signed-off-by: Jiri Kosina <jkosina@suse.cz>
12 years agoHID: debug: break out hid_dump_report() into hid-debug
Benjamin Tissoires [Wed, 17 Apr 2013 17:38:13 +0000 (19:38 +0200)]
HID: debug: break out hid_dump_report() into hid-debug

No semantic changes, but hid_dump_report should be in hid-debug.c, not
in hid-core.c

Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
12 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
David S. Miller [Tue, 30 Apr 2013 07:50:54 +0000 (03:50 -0400)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Conflicts:
drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
drivers/net/ethernet/emulex/benet/be.h
include/net/tcp.h
net/mac802154/mac802154.h

Most conflicts were minor overlapping stuff.

The be2net driver brought in some fixes that added __vlan_put_tag
calls, which in net-next take an additional argument.

Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoALSA: usb-audio: caiaq: fix endianness bug in snd_usb_caiaq_maschine_dispatch
Eldad Zack [Mon, 29 Apr 2013 19:15:46 +0000 (21:15 +0200)]
ALSA: usb-audio: caiaq: fix endianness bug in snd_usb_caiaq_maschine_dispatch

Current code does this:

  be16_to_cpu(buf[i * 2] << 8 | buf[(i * 2) + 1])

Which is effectively (neglecting the index):

  be16_to_cpu(be16_to_cpu(*((u16 *) buf)))

This means the int16 in the buffer is not converted at all.

Daniel Mack confirmed that the driver works on little endian
CPUs, leading to the conclusion that the device-side structure
is actually little endian.
This changes the code to use le16_to_cpu().
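
For illustration only (not part of the patch), a small userspace sketch of
the argument above. The demo_* helpers are hypothetical; demo_swap16() stands
in for what be16_to_cpu() does on a little-endian CPU.

```c
#include <stdint.h>

/* Decode two bytes as a big-endian 16-bit value (host independent). */
uint16_t demo_get_be16(const uint8_t *p)
{
	return (uint16_t)((p[0] << 8) | p[1]);
}

/* Decode two bytes as a little-endian 16-bit value (host independent). */
uint16_t demo_get_le16(const uint8_t *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

/* On a little-endian CPU, be16_to_cpu() amounts to a byte swap. */
uint16_t demo_swap16(uint16_t v)
{
	return (uint16_t)((v << 8) | (v >> 8));
}

/* The old expression as a little-endian CPU evaluates it:
 * be16_to_cpu(buf[0] << 8 | buf[1]). */
uint16_t demo_old_expr_on_le_cpu(const uint8_t *p)
{
	return demo_swap16(demo_get_be16(p));
}
```

The manual shift already decodes the bytes as big endian, and the extra swap
undoes it, so on a little-endian CPU the old expression yields the
little-endian reading of the buffer.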

Caught by sparse.

Acked-by: Daniel Mack <zonque@gmail.com>
Signed-off-by: Eldad Zack <eldad@fogrefinery.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
12 years agopartitions/efi.c: replace useless kzalloc's by kmalloc's
Philippe De Muyter [Mon, 29 Apr 2013 21:00:18 +0000 (23:00 +0200)]
partitions/efi.c: replace useless kzalloc's by kmalloc's

In alloc_read_gpt_entries and alloc_read_gpt_header, the kzalloc'ated
zones are either totally overwritten by the following read_lba call,
or freed.  As kmalloc is cheaper than kzalloc, use kmalloc.

Signed-off-by: Philippe De Muyter <phdm@macqel.be>
Cc: Matt Domsch <Matt_Domsch@dell.com>
Cc: Panagiotis Issaris <takis@issaris.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
12 years agopowerpc: Update tlbie/tlbiel as per ISA doc
Aneesh Kumar K.V [Sun, 28 Apr 2013 09:37:39 +0000 (09:37 +0000)]
powerpc: Update tlbie/tlbiel as per ISA doc

Encode the actual page size correctly in tlbie/tlbiel. This makes sure we
handle segments with multiple page sizes correctly.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
12 years agopowerpc: Print page size info during boot
Aneesh Kumar K.V [Sun, 28 Apr 2013 09:37:38 +0000 (09:37 +0000)]
powerpc: Print page size info during boot

This gives a hint about the different base and actual page size combinations
supported by the platform.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
12 years agopowerpc: print both base and actual page size on hash failure
Aneesh Kumar K.V [Sun, 28 Apr 2013 09:37:37 +0000 (09:37 +0000)]
powerpc: print both base and actual page size on hash failure

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
12 years agopowerpc: Fix hpte_decode to use the correct decoding for page sizes
Aneesh Kumar K.V [Sun, 28 Apr 2013 09:37:36 +0000 (09:37 +0000)]
powerpc: Fix hpte_decode to use the correct decoding for page sizes

As per the ISA doc, we encode the base and actual page size in the LP bits
of the PTE. The number of bits used to encode the page sizes depends on the
actual page size.  The ISA doc lists this as

   PTE LP     actual page size
rrrr rrrz     >=8KB
rrrr rrzz     >=16KB
rrrr rzzz     >=32KB
rrrr zzzz     >=64KB
rrrz zzzz     >=128KB
rrzz zzzz     >=256KB
rzzz zzzz     >=512KB
zzzz zzzz     >=1MB

ISA doc also says
"The values of the “z” bits used to specify each size, along with all possible
values of “r” bits in the LP field, must result in LP values distinct from
other LP values for other sizes."

Based on the above, update hpte_decode to use the correct decoding for LP bits.
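
For illustration only, the table above can be read as a rule for how many
low "z" bits the LP field carries for a given actual page size. The helpers
below are hypothetical and are not the kernel's hpte_decode().

```c
/* Number of low "z" bits in the 8-bit LP field for an actual page size
 * of (1 << shift) bytes, read off the table above. */
unsigned int demo_lp_z_bits(unsigned int shift)
{
	if (shift <= 12)
		return 0;	/* below the table's range (assumed: no z bits) */
	if (shift >= 20)
		return 8;	/* >= 1MB: all eight bits are "z" */
	return shift - 12;	/* 8KB -> 1, 16KB -> 2, ..., 512KB -> 7 */
}

/* Mask selecting the "z" bits of an LP value. */
unsigned int demo_lp_z_mask(unsigned int shift)
{
	return (1u << demo_lp_z_bits(shift)) - 1;
}
```

For a 64KB actual page size (shift 16) this gives four "z" bits and a mask
of 0x0f, matching the "rrrr zzzz" row.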

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
12 years agopowerpc: Decode the pte-lp-encoding bits correctly.
Aneesh Kumar K.V [Sun, 28 Apr 2013 09:37:35 +0000 (09:37 +0000)]
powerpc: Decode the pte-lp-encoding bits correctly.

We look at both the segment base page size and actual page size and store
the pte-lp-encodings in an array per base page size.

We also update all relevant functions to take an actual page size argument
so that we can use the correct PTE LP encoding in the HPTE. This should also
provide basic Multiple Page Size per Segment (MPSS) support, which is needed
to enable THP on ppc64.

[Fixed PR KVM build --BenH]

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
12 years agopowerpc: Use encode avpn where we need only avpn values
Aneesh Kumar K.V [Sun, 28 Apr 2013 09:37:34 +0000 (09:37 +0000)]
powerpc: Use encode avpn where we need only avpn values

In all these cases we are doing something similar to

HPTE_V_COMPARE(hpte_v, want_v) which ignores the HPTE_V_LARGE bit

With MPSS support we would need the actual page size to set the HPTE_V_LARGE
bit, and that won't be available in most of these cases. Since we are ignoring
the HPTE_V_LARGE bit, use the avpn value instead. There should not be any change
in behaviour after this patch.
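
For illustration only, a sketch of a comparison that ignores the low flag
bits. The DEMO_* constants are assumed values chosen for the example and
should be checked against the kernel's own definitions.

```c
#include <stdint.h>

/* Demo constants; assumed for illustration, not copied from the kernel. */
#define DEMO_HPTE_V_LARGE		0x0000000000000004ULL
#define DEMO_HPTE_V_COMPARE_MASK	0xffffffffffffff80ULL

/* Compare two HPTE "v" words while ignoring the low flag bits. */
int demo_hpte_v_compare(uint64_t hpte_v, uint64_t want_v)
{
	return !((hpte_v ^ want_v) & DEMO_HPTE_V_COMPARE_MASK);
}
```

Because the large-page flag lies outside the compared mask, setting or
clearing it in want_v does not change the result, which is why the avpn
value alone is enough.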

Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
12 years agopowerpc: Reduce PTE table memory wastage
Aneesh Kumar K.V [Sun, 28 Apr 2013 09:37:33 +0000 (09:37 +0000)]
powerpc: Reduce PTE table memory wastage

We allocate one page for the last level of the Linux page table. With THP and
a large page size of 16MB, that would mean we are wasting a large part
of that page. To map a 16MB area, we only need a PTE space of 2K with a 64K
page size. This patch reduces the space wastage by sharing the page
allocated for the last level of the Linux page table among multiple pmd
entries. We call these smaller chunks PTE page fragments, and the allocated
page the PTE page.

In order to support systems which don't have 64K HPTE support, we also
add another 2K to the PTE page fragment. The second half of the PTE fragment
is used for storing slot and secondary bit information of an HPTE. With this
we now have a 4K PTE fragment.

We use a simple approach to share the PTE page. On allocation, we bump the
PTE page refcount to 16 and share the PTE page with the next 16 pte alloc
requests. This should help the node locality of the PTE page fragments,
assuming that the immediately following pte alloc requests will mostly come
from the same NUMA node. We don't try to reuse a freed PTE page fragment, hence
we could be wasting some space.
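
For illustration only, the arithmetic behind these numbers, assuming 8-byte
PTEs and a 64K base page size. All names are hypothetical.

```c
/* Sizes from the description above, expressed as compile-time constants. */
enum {
	DEMO_PAGE_SIZE      = 64 * 1024,
	DEMO_PTE_BYTES      = 8,
	/* PTEs needed to map one 16MB range with 64K pages */
	DEMO_PTES_PER_16MB  = (16 * 1024 * 1024) / DEMO_PAGE_SIZE,
	/* PTE space for those entries: 256 * 8 = 2K */
	DEMO_PTE_SPACE      = DEMO_PTES_PER_16MB * DEMO_PTE_BYTES,
	/* The second 2K half holds HPTE slot and secondary bit information */
	DEMO_FRAG_SIZE      = 2 * DEMO_PTE_SPACE,
	/* Fragments carved out of one 64K PTE page */
	DEMO_FRAGS_PER_PAGE = DEMO_PAGE_SIZE / DEMO_FRAG_SIZE
};
```

The last constant works out to 16, which is the refcount the PTE page is
bumped to on allocation.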

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
12 years agopowerpc: Move the pte free routines from common header
Aneesh Kumar K.V [Sun, 28 Apr 2013 09:37:32 +0000 (09:37 +0000)]
powerpc: Move the pte free routines from common header

This patch moves the common code to the 32/64 bit headers and also duplicates
the 4K_PAGES and 64K_PAGES sections. We will later change the 64 bit 64K_PAGES
version to support smaller PTE fragments. The patch doesn't introduce
any functional changes.

Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
12 years agopowerpc: Reduce the PTE_INDEX_SIZE
Aneesh Kumar K.V [Sun, 28 Apr 2013 09:37:31 +0000 (09:37 +0000)]
powerpc: Reduce the PTE_INDEX_SIZE

This makes one PMD cover a 16MB range. That helps in easier implementation of
THP on power. The THP core code makes use of one pmd entry to track the
hugepage, and the range mapped by a single pmd entry should be equal to the
hugepage size supported by the hardware.

This also switches the PGD to cover 16GB. That is needed so that we can
simplify the hugetlb page walking code and have the same pte format for
explicit hugepages and THP hugepages.
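
For illustration only, how the 16MB figure falls out, assuming a 64K base
page (shift 16) and an 8-bit PTE index. The names are hypothetical.

```c
/* With an 8-bit PTE index on top of a 64K page, one PMD entry spans
 * 1 << 24 bytes, i.e. 16MB. */
enum {
	DEMO_PAGE_SHIFT     = 16,
	DEMO_PTE_INDEX_SIZE = 8,
	DEMO_PMD_SHIFT      = DEMO_PAGE_SHIFT + DEMO_PTE_INDEX_SIZE
};

/* Bytes covered by one PMD entry. */
unsigned long demo_pmd_range(void)
{
	return 1UL << DEMO_PMD_SHIFT;
}
```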

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
12 years agopowerpc: Switch 16GB and 16MB explicit hugepages to a different page table format
Aneesh Kumar K.V [Sun, 28 Apr 2013 09:37:30 +0000 (09:37 +0000)]
powerpc: Switch 16GB and 16MB explicit hugepages to a different page table format

We will be switching PMD_SHIFT to 24 bits to facilitate the THP implementation.
With PMD_SHIFT set to 24, we now have 16MB huge pages allocated at PGD level.
That means that with a 32 bit process we cannot allocate normal pages at
all, because we cover the entire address space with one pgd entry. Fix this
by switching to a new page table format for hugepages. With the new page table
format for 16GB and 16MB hugepages we won't allocate a hugepage directory.
Instead we encode the PTE information directly at the directory level. This
forces 16MB hugepages to the PMD level. This will also make the page table walk
much simpler later when we add the THP support.

With the new table format we have 4 cases for pgds and pmds:
(1) invalid (all zeroes)
(2) pointer to next table, as normal; bottom 6 bits == 0
(3) leaf pte for huge page, bottom two bits != 00
(4) hugepd pointer, bottom two bits == 00, next 4 bits indicate size of table
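
For illustration only, the four cases can be told apart by the low bits of
an entry. This is a hypothetical sketch, not the kernel's helpers, and it
assumes a hugepd pointer always carries a non-zero size field.

```c
#include <stdint.h>

enum demo_entry_kind {
	DEMO_INVALID,		/* (1) all zeroes */
	DEMO_NEXT_TABLE,	/* (2) bottom 6 bits == 0 */
	DEMO_HUGE_LEAF,		/* (3) bottom two bits != 00 */
	DEMO_HUGEPD		/* (4) bottom two bits == 00, size in next 4 */
};

enum demo_entry_kind demo_classify(uint64_t entry)
{
	if (entry == 0)
		return DEMO_INVALID;
	if ((entry & 0x3) != 0)
		return DEMO_HUGE_LEAF;
	if ((entry & 0x3f) == 0)
		return DEMO_NEXT_TABLE;
	return DEMO_HUGEPD;	/* assumes a non-zero size field */
}

/* Size field of a hugepd pointer: the 4 bits above the bottom two. */
unsigned int demo_hugepd_size_field(uint64_t entry)
{
	return (unsigned int)((entry >> 2) & 0xf);
}
```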

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
12 years agopowerpc: New hugepage directory format
Aneesh Kumar K.V [Sun, 28 Apr 2013 09:37:29 +0000 (09:37 +0000)]
powerpc: New hugepage directory format

Change the hugepage directory format so that we can have leaf ptes directly
at page directory avoiding the allocation of hugepage directory.

With the new table format we have 3 cases for pgds and pmds:
(1) invalid (all zeroes)
(2) pointer to next table, as normal; bottom 6 bits == 0
(4) hugepd pointer, bottom two bits == 00, next 4 bits indicate size of table

Instead of storing shift value in hugepd pointer we use mmu_psize_def index
so that we can fit all the supported hugepage size in 4 bits

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
12 years agopowerpc: Don't truncate pgd_index wrongly
Aneesh Kumar K.V [Sun, 28 Apr 2013 09:37:28 +0000 (09:37 +0000)]
powerpc: Don't truncate pgd_index wrongly

With PGD_INDEX_SIZE set to 12 the existing macro doesn't work. Fix it to
use PTRS_PER_PGD

The idea originally was to have one more bit in the result of
pgd_index() than PGD_INDEX_SIZE, so that if one had an address
corresponding to the last PGD entry, and then incremented that address
by PGD_SIZE, and took pgd_index() of that, you wouldn't end up with
zero.  The commit that introduced that dates back to 2002, and the
code that was sensitive to that edge case has long since been
refactored (several times), so there is no need for it these days.
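
For illustration only, a sketch of why a mask derived from PTRS_PER_PGD
keeps working when the index grows to 12 bits. The shift value and the
hard-coded 9-bit mask are hypothetical.

```c
#include <stdint.h>

/* Illustrative values: a 12-bit PGD index and a hypothetical shift. */
#define DEMO_PGD_INDEX_SIZE	12
#define DEMO_PGDIR_SHIFT	34
#define DEMO_PTRS_PER_PGD	(1UL << DEMO_PGD_INDEX_SIZE)

/* Mask derived from the table size, so it tracks the index size. */
unsigned long demo_pgd_index(uint64_t addr)
{
	return (unsigned long)((addr >> DEMO_PGDIR_SHIFT) &
			       (DEMO_PTRS_PER_PGD - 1));
}

/* A hard-coded 9-bit mask (hypothetical) would truncate the index. */
unsigned long demo_pgd_index_hardcoded(uint64_t addr)
{
	return (unsigned long)((addr >> DEMO_PGDIR_SHIFT) & 0x1ff);
}
```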

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
12 years agopowerpc: Don't hard code the size of pte page
Aneesh Kumar K.V [Sun, 28 Apr 2013 09:37:27 +0000 (09:37 +0000)]
powerpc: Don't hard code the size of pte page

Use PTRS_PER_PTE to indicate the size of the pte page. To support THP,
later patches will be changing the PTRS_PER_PTE value.

Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
12 years agopowerpc: Save DAR and DSISR in pt_regs on MCE
Aneesh Kumar K.V [Sun, 28 Apr 2013 09:37:26 +0000 (09:37 +0000)]
powerpc: Save DAR and DSISR in pt_regs on MCE

We were not saving DAR and DSISR on MCE. Save them and also print the values
along with the exception details in xmon.

Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
12 years agopowerpc: Use signed formatting when printing error
Aneesh Kumar K.V [Sun, 28 Apr 2013 09:37:25 +0000 (09:37 +0000)]
powerpc: Use signed formatting when printing error

PAPR defines these errors as negative values. So print them accordingly
for easy debugging.

Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
12 years agopowerpc/pseries: Correct builds break when CONFIG_SMP not defined
Nathan Fontenot [Mon, 29 Apr 2013 03:45:36 +0000 (03:45 +0000)]
powerpc/pseries: Correct builds break when CONFIG_SMP not defined

Correct build failure for powerpc/pseries builds with CONFIG_SMP not defined.

The function cpu_sibling_mask has no meaning (or definition) when CONFIG_SMP
is not defined. Additionally, the updating of NUMA affinity for a CPU in a UP
system doesn't really make sense.

This patch ifdef's out the code making the affinity updates for PRRN events to
fix the following build break.

arch/powerpc/mm/numa.c: In function ‘stage_topology_update’:
arch/powerpc/mm/numa.c:1535: error: implicit declaration of function ‘cpu_sibling_mask’
arch/powerpc/mm/numa.c:1535: warning: passing argument 3 of ‘cpumask_or’ makes pointer from integer without a cast
make[1]: *** [arch/powerpc/mm/numa.o] Error 1

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
12 years agopowerpc/booke: Remove obsolete macro FINISH_EXCEPTION
Kevin Hao [Mon, 29 Apr 2013 00:59:57 +0000 (00:59 +0000)]
powerpc/booke: Remove obsolete macro FINISH_EXCEPTION

This is stale and not used by anyone now.

Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
12 years agopowerpc/rtas_flash: Fix bad memory access
Vasant Hegde [Sun, 28 Apr 2013 18:43:56 +0000 (18:43 +0000)]
powerpc/rtas_flash: Fix bad memory access

We use kmem_cache_alloc() to allocate memory to hold the new firmware
which will be flashed. kmem_cache_alloc() calls rtas_block_ctor() to
zero the memory. But this constructor is called only for newly
allocated slabs.

If we run the command below multiple times without rebooting, the allocator
may allocate memory from an area which was freed by kmem_cache_free(), and
it will not call the constructor. In this situation we may hit a kernel oops.

dd if=<fw image> of=/proc/ppc64/rtas/firmware_flash bs=4096

oops message:
-------------
[ 1602.399755] Oops: Kernel access of bad area, sig: 11 [#1]
[ 1602.399772] SMP NR_CPUS=1024 NUMA pSeries
[ 1602.399779] Modules linked in: rtas_flash nfsd lockd auth_rpcgss nfs_acl sunrpc fuse loop dm_mod sg ipv6 ses enclosure ehea ehci_pci ohci_hcd ehci_hcd usbcore sd_mod usb_common crc_t10dif scsi_dh_alua scsi_dh_emc scsi_dh_hp_sw scsi_dh_rdac scsi_dh ipr libata scsi_mod
[ 1602.399817] NIP: d00000000a170b9c LR: d00000000a170b64 CTR: c00000000079cd58
[ 1602.399823] REGS: c0000003b9937930 TRAP: 0300   Not tainted  (3.9.0-rc4-0.27-ppc64)
[ 1602.399828] MSR: 8000000000009032 <SF,EE,ME,IR,DR,RI>  CR: 22000428  XER: 20000000
[ 1602.399841] SOFTE: 1
[ 1602.399844] CFAR: c000000000005f24
[ 1602.399848] DAR: 8c2625a820631fef, DSISR: 40000000
[ 1602.399852] TASK = c0000003b4520760[3655] 'dd' THREAD: c0000003b9934000 CPU: 3
GPR00: 8c2625a820631fe7 c0000003b9937bb0 d00000000a179f28 d00000000a171f08
GPR04: 0000000010040000 0000000000001000 c0000003b9937df0 c0000003b5fb2080
GPR08: c0000003b58f7200 d00000000a179f28 c0000003b40058d4 c00000000079cd58
GPR12: d00000000a171450 c000000007f40900 0000000000000005 0000000010178d20
GPR16: 00000000100cb9d8 000000000000001d 0000000000000000 000000001003ffff
GPR20: 0000000000000001 0000000000000000 00003fffa0b50d30 000000001001f010
GPR24: 0000000010020888 0000000010040000 d00000000a171f08 d00000000a172808
GPR28: 0000000000001000 0000000010040000 c0000003b4005880 8c2625a820631fe7
[ 1602.399924] NIP [d00000000a170b9c] .rtas_flash_write+0x7c/0x1e8 [rtas_flash]
[ 1602.399930] LR [d00000000a170b64] .rtas_flash_write+0x44/0x1e8 [rtas_flash]
[ 1602.399934] Call Trace:
[ 1602.399939] [c0000003b9937bb0] [d00000000a170b64] .rtas_flash_write+0x44/0x1e8 [rtas_flash] (unreliable)
[ 1602.399948] [c0000003b9937c60] [c000000000282830] .proc_reg_write+0x90/0xe0
[ 1602.399955] [c0000003b9937ce0] [c0000000001ff374] .vfs_write+0x114/0x238
[ 1602.399961] [c0000003b9937d80] [c0000000001ff5d8] .SyS_write+0x70/0xe8
[ 1602.399968] [c0000003b9937e30] [c000000000009cdc] syscall_exit+0x0/0xa0
[ 1602.399973] Instruction dump:
[ 1602.399977] eb698010 801b0028 2f80dcd6 419e00a4 2fbc0000 419e009c ebfb0030 2fbf0000
[ 1602.399989] 409e0010 480000d8 60000000 7c1f0378 <e81f0008> 2fa00000 409efff4 e81f0000
[ 1602.400012] ---[ end trace b4136d115dc31dac ]---
[ 1602.402178]
[ 1602.402185] Sending IPI to other CPUs
[ 1602.403329] IPI complete

This patch uses kmem_cache_zalloc() instead of kmem_cache_alloc() to
allocate memory, which makes sure memory is set to 0 before using.
Also removes rtas_block_ctor(), which is no longer required.
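
For illustration only, a toy userspace model of the problem. It is not the
slab allocator; the one-slot cache and all demo_* names are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical object and one-slot cache; a toy model only. */
struct demo_obj {
	struct demo_obj *next;
	unsigned long length;
};

struct demo_cache {
	struct demo_obj *free_obj;		/* last freed object, if any */
	void (*ctor)(struct demo_obj *);	/* runs on fresh memory only */
};

void demo_ctor(struct demo_obj *obj)
{
	memset(obj, 0, sizeof(*obj));
}

struct demo_obj *demo_cache_alloc(struct demo_cache *c)
{
	struct demo_obj *obj = c->free_obj;

	if (obj) {			/* reuse: constructor is NOT called */
		c->free_obj = NULL;
		return obj;
	}
	obj = malloc(sizeof(*obj));
	if (obj && c->ctor)
		c->ctor(obj);
	return obj;
}

void demo_cache_free(struct demo_cache *c, struct demo_obj *obj)
{
	if (c->free_obj)
		free(obj);
	else
		c->free_obj = obj;	/* stale contents are kept for reuse */
}

/* zalloc variant: always clears, whether fresh or reused. */
struct demo_obj *demo_cache_zalloc(struct demo_cache *c)
{
	struct demo_obj *obj = demo_cache_alloc(c);

	if (obj)
		memset(obj, 0, sizeof(*obj));
	return obj;
}
```

In this model a reused object comes back from demo_cache_alloc() with its
old pointer still set, while demo_cache_zalloc() always returns cleared
memory.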

Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
12 years agopowerpc: Fix build failure after merge of the cgroup tree
Stephen Rothwell [Sun, 28 Apr 2013 18:04:33 +0000 (18:04 +0000)]
powerpc: Fix build failure after merge of the cgroup tree

After merging the cgroup tree, today's linux-next build (powerpc
ppc64_defconfig) failed like this:

arch/powerpc/mm/numa.c: In function 'arch_update_cpu_topology':
arch/powerpc/mm/numa.c:1465:2: error: implicit declaration of function 'kzalloc' [-Werror=implicit-function-declaration]
arch/powerpc/mm/numa.c:1465:10: error: assignment makes pointer from integer without a cast [-Werror]
arch/powerpc/mm/numa.c:1497:2: error: implicit declaration of function 'kfree' [-Werror=implicit-function-declaration]

Caused by commit 30c05350c39d ("powerpc/pseries: Use stop machine to
update cpu maps") from the powerpc tree interacting with (probably)
commit ff794dea52ea ("cpuset: remove include of cgroup.h from cpuset.h")
from the cgroup tree.  Removing includes from header files is fraught
with danger ...

The former should have added an include of linux/slab.h to
arch/powerpc/mm/numa.c.

I have added the following merge fix patch for today (but it should be
applied to the powerpc tree ASAP).

From: Stephen Rothwell <sfr@canb.auug.org.au>
Date: Mon, 29 Apr 2013 14:01:44 +1000
Subject: [PATCH] powerpc: numa.c: using kzalloc/kfree requires including
 slab.h

fixes these build errors:

arch/powerpc/mm/numa.c: In function 'arch_update_cpu_topology':
arch/powerpc/mm/numa.c:1465:2: error: implicit declaration of function 'kzalloc' [-Werror=implicit-function-declaration]
arch/powerpc/mm/numa.c:1465:10: error: assignment makes pointer from integer without a cast [-Werror]
arch/powerpc/mm/numa.c:1497:2: error: implicit declaration of function 'kfree' [-Werror=implicit-function-declaration]

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
12 years agopowerpc: Fix usage of setup_pci_atmu()
Michael Neuling [Sun, 14 Apr 2013 19:42:01 +0000 (19:42 +0000)]
powerpc: Fix usage of setup_pci_atmu()

Linux next is currently failing to compile mpc85xx_defconfig with:
  arch/powerpc/sysdev/fsl_pci.c:944:2: error: too many arguments to function 'setup_pci_atmu'

This is caused by (from Kumar's next branch):
  commit 34642bbb3d12121333efcf4ea7dfe66685e403a1
  Author: Kumar Gala <galak@kernel.crashing.org>
  powerpc/fsl-pci: Keep PCI SoC controller registers in pci_controller

Which changed the definition of setup_pci_atmu() but didn't update one of
the callers.  The patch below fixes this.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Reviewed-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
12 years agoMD: ignore discard request for hard disks of hybid raid1/raid10 array
Shaohua Li [Sun, 28 Apr 2013 10:26:38 +0000 (18:26 +0800)]
MD: ignore discard request for hard disks of hybid raid1/raid10 array

In SSD/hard disk hybrid storage, discard requests should be ignored for hard
disks. We used to do it this way, but the unplug path forgot it.

This is suitable for stable tree since v3.6.

Cc: stable@vger.kernel.org
Reported-and-tested-by: Markus <M4rkusXXL@web.de>
Signed-off-by: Shaohua Li <shli@fusionio.com>
Signed-off-by: NeilBrown <neilb@suse.de>
12 years agomd: bad block list should default to disabled.
NeilBrown [Wed, 24 Apr 2013 01:42:44 +0000 (11:42 +1000)]
md: bad block list should default to disabled.

Maintenance of a bad-block-list currently defaults to 'enabled'
and is then disabled when it cannot be supported.
This is backwards and causes problems for dm-raid, which didn't know
to disable it.

So fix the defaults, and only enable it for v1.x metadata which
explicitly has bad blocks enabled.

The problem with dm-raid has been present since badblock support was
added in v3.1, so this patch is suitable for any -stable from 3.1
onwards.

Cc: stable@vger.kernel.org (3.1+)
Reported-by: Jonathan Brassow <jbrassow@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
12 years agomd: raid1/raid10 md devices leak memory when stopping
Hirokazu Takahashi [Wed, 24 Apr 2013 01:42:44 +0000 (11:42 +1000)]
md: raid1/raid10 md devices leak memory when stopping

Hi.

Raid1 and raid10 devices leak memory every time they stop.
This is a patch for linux-3.9.0-rc7 to fix this problem.

Thanks,
Hirokazu Takahashi.

Signed-off-by: Hirokazu Takahashi <taka@valinux.co.jp>
Signed-off-by: NeilBrown <neilb@suse.de>
12 years agounix/stream: fix peeking with an offset larger than data in queue
Benjamin Poirier [Mon, 29 Apr 2013 11:42:14 +0000 (11:42 +0000)]
unix/stream: fix peeking with an offset larger than data in queue

Currently, peeking on a unix stream socket with an offset larger than len of
the data in the sk receive queue returns immediately with bogus data.

This patch fixes this so that the behavior is the same as peeking with no
offset on an empty queue: the caller blocks.

Signed-off-by: Benjamin Poirier <bpoirier@suse.de>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agounix/dgram: fix peeking with an offset larger than data in queue
Benjamin Poirier [Mon, 29 Apr 2013 11:42:13 +0000 (11:42 +0000)]
unix/dgram: fix peeking with an offset larger than data in queue

Currently, peeking on a unix datagram socket with an offset larger than len of
the data in the sk receive queue returns immediately with bogus data. That's
because *off is not reset between each skb_queue_walk().

This patch fixes this so that the behavior is the same as peeking with no
offset on an empty queue: the caller blocks.

Signed-off-by: Benjamin Poirier <bpoirier@suse.de>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agounix/dgram: peek beyond 0-sized skbs
Benjamin Poirier [Mon, 29 Apr 2013 11:42:12 +0000 (11:42 +0000)]
unix/dgram: peek beyond 0-sized skbs

"77c1090 net: fix infinite loop in __skb_recv_datagram()" (v3.8) introduced a
regression:
After that commit, recv can no longer peek beyond a 0-sized skb in the queue.
__skb_recv_datagram() instead stops at the first skb with len == 0 and results
in the system call failing with -EFAULT via skb_copy_datagram_iovec().

When peeking at an offset with 0-sized skb(s), each one of those is received
only once, in sequence. The offset starts moving forward again after receiving
datagrams with len > 0.

Signed-off-by: Benjamin Poirier <bpoirier@suse.de>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoopenvswitch: Remove unneeded ovs_netdev_get_ifindex()
Thomas Graf [Mon, 29 Apr 2013 13:06:41 +0000 (13:06 +0000)]
openvswitch: Remove unneeded ovs_netdev_get_ifindex()

The only user is get_dpifindex(), no need to redirect via the port
operations.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonet: Use consume_skb() to free gso segmented skb
Sridhar Samudrala [Mon, 29 Apr 2013 13:02:42 +0000 (13:02 +0000)]
net: Use consume_skb() to free gso segmented skb

Use consume_skb() to free the original skb that is successfully transmitted
as gso segmented skbs so that it is not treated as a drop due to an error.

Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoMerge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville...
David S. Miller [Tue, 30 Apr 2013 04:11:37 +0000 (00:11 -0400)]
Merge branch 'for-davem' of git://git./linux/kernel/git/linville/wireless-next

John W. Linville says:

====================
A few more stragglers intended for 3.10...

For the Bluetooth bits, Gustavo says:

"A few more patches intended for 3.10, the most important one is the support in
btusb for fw loading for the Intel Bluetooth device. Other than that we have
only fixes and clean ups."

For the iwlwifi bits, Johannes says:

"Here are a few more changes for the 3.10 stream, some bugfixes,
adjustments to some powersave parameters and a new device ID."

For the NFC bits, Samuel says:

"This pull request includes Marcel's Kconfig dependency fix on top of the LLCP
code move to net/nfc."

On top of that...Yogesh Ashok Powar provides a few PCI-related mwifiex
updates, Hauke Mehrtens provides a small ssb feature for spurious
tone avoidance on a specific chip, and Larry Finger provides a small
rtlwifi fix related to avoiding false detection of AP loss.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agof2fs: modify the number of issued pages to merge IOs
Jaegeuk Kim [Mon, 29 Apr 2013 07:58:39 +0000 (16:58 +0900)]
f2fs: modify the number of issued pages to merge IOs

When testing f2fs on an SSD, I found that f2fs_write_node_pages issued some
128-page IOs, each followed by a 1-page IO.
This means that some flows were mishandled, which degrades performance.

Previously, f2fs_write_node_pages determined the number of pages to be written,
nr_to_write, as follows.

1. The bio_get_nr_vecs returns 129 pages.
2. The bio_alloc makes room for 128 pages.
3. The initial 128 pages go into one bio.
4. The existing bio is submitted, and a new bio is prepared for the last 1 page.
5. Finally, sync_node_pages submits the last 1 page bio.

The problem comes from the use of bio_get_nr_vecs, so this patch replaces it
with max_hw_blocks using queue_max_sectors.
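
The arithmetic behind the 128 + 1 split can be sketched as follows. This is
only an illustration of the numbers quoted above; bios_needed() and
last_bio_pages() are hypothetical helpers, not f2fs functions.

```c
#include <assert.h>

/* Number of bios needed when each bio holds at most bio_pages pages. */
static unsigned int bios_needed(unsigned int nr_to_write, unsigned int bio_pages)
{
	assert(bio_pages > 0);
	return (nr_to_write + bio_pages - 1) / bio_pages;
}

/* Pages carried by the final bio (0 if there is nothing to write). */
static unsigned int last_bio_pages(unsigned int nr_to_write, unsigned int bio_pages)
{
	unsigned int rem;

	assert(bio_pages > 0);
	rem = nr_to_write % bio_pages;
	if (rem != 0)
		return rem;
	return nr_to_write ? bio_pages : 0;
}
```

Asking for 129 pages with room for 128 per bio gives two bios, the second
carrying a single page; asking for 128 gives one full bio.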

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
12 years agof2fs: remove useless #include <linux/proc_fs.h> as we're now using sysfs as debug...
Haicheng Li [Sun, 28 Apr 2013 11:16:07 +0000 (19:16 +0800)]
f2fs: remove useless #include <linux/proc_fs.h> as we're now using sysfs as debug entry.

Signed-off-by: Haicheng Li <haicheng.li@linux.intel.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
12 years agof2fs: fix inconsistent using of NM_WOUT_THRESHOLD
Haicheng Li [Sun, 28 Apr 2013 11:16:06 +0000 (19:16 +0800)]
f2fs: fix inconsistent using of NM_WOUT_THRESHOLD

try_to_free_nats() is usually called with parameter nr_shrink as
"nm_i->nat_cnt - NM_WOUT_THRESHOLD"
by flush_nat_entries() during checkpointing process.

However, this is inconsistent with the actual threshold check as
"if (nm_i->nat_cnt < 2 * NM_WOUT_THRESHOLD)"
, which will ignore the free_nats requests when
NM_WOUT_THRESHOLD < nm_i->nat_cnt < 2 * NM_WOUT_THRESHOLD

So fix the threshold check condition.
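
A minimal sketch of the mismatch, with loud assumptions: the value of
NM_WOUT_THRESHOLD below is invented, the helper names are hypothetical, and
the "new" condition is only one plausible form of the corrected check, since
this message does not spell it out.

```c
#include <assert.h>
#include <stdbool.h>

#define NM_WOUT_THRESHOLD 64U	/* hypothetical value for illustration */

/* nr_shrink as the caller computes it: entries above the threshold. */
static unsigned int nr_shrink_for(unsigned int nat_cnt)
{
	return nat_cnt > NM_WOUT_THRESHOLD ? nat_cnt - NM_WOUT_THRESHOLD : 0;
}

/* Old check: the request is ignored below twice the threshold. */
static bool old_check_skips(unsigned int nat_cnt)
{
	return nat_cnt < 2 * NM_WOUT_THRESHOLD;
}

/* Assumed corrected check: ignore only at or below the threshold. */
static bool new_check_skips(unsigned int nat_cnt)
{
	return nat_cnt <= NM_WOUT_THRESHOLD;
}
```

With nat_cnt = 100, the caller asks to free 36 entries, the old check skips
the request, and the assumed corrected check lets it through.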

Signed-off-by: Haicheng Li <haicheng.li@linux.intel.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
12 years agoMerge branch 'akpm' (incoming from Andrew)
Linus Torvalds [Tue, 30 Apr 2013 02:47:50 +0000 (19:47 -0700)]
Merge branch 'akpm' (incoming from Andrew)

Merge second batch of fixes from Andrew Morton:

 - various misc bits

 - some printk updates

 - a new "SRAM" driver.

 - MAINTAINERS updates

 - the backlight driver queue

 - checkpatch updates

 - a few init/ changes

 - a huge number of drivers/rtc changes

 - fatfs updates

 - some lib/idr.c work

 - some renaming of the random driver interfaces

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (285 commits)
  net: rename random32 to prandom
  net/core: remove duplicate statements by do-while loop
  net/core: rename random32() to prandom_u32()
  net/netfilter: rename random32() to prandom_u32()
  net/sched: rename random32() to prandom_u32()
  net/sunrpc: rename random32() to prandom_u32()
  scsi: rename random32() to prandom_u32()
  lguest: rename random32() to prandom_u32()
  uwb: rename random32() to prandom_u32()
  video/uvesafb: rename random32() to prandom_u32()
  mmc: rename random32() to prandom_u32()
  drbd: rename random32() to prandom_u32()
  kernel/: rename random32() to prandom_u32()
  mm/: rename random32() to prandom_u32()
  lib/: rename random32() to prandom_u32()
  x86: rename random32() to prandom_u32()
  x86: pageattr-test: remove srandom32 call
  uuid: use prandom_bytes()
  raid6test: use prandom_bytes()
  sctp: convert sctp_assoc_set_id() to use idr_alloc_cyclic()
  ...

12 years agoMerge branch 'for-3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Linus Torvalds [Tue, 30 Apr 2013 02:14:20 +0000 (19:14 -0700)]
Merge branch 'for-3.10' of git://git./linux/kernel/git/tj/cgroup

Pull cgroup updates from Tejun Heo:

 - Fixes and a lot of cleanups.  Locking cleanup is finally complete.
   cgroup_mutex is no longer exposed to individual controllers which
   used to cause nasty deadlock issues.  Li fixed and cleaned up quite a
   bit including long standing ones like racy cgroup_path().

 - device cgroup now supports proper hierarchy thanks to Aristeu.

 - perf_event cgroup now supports proper hierarchy.

 - A new mount option "__DEVEL__sane_behavior" is added.  As indicated
   by the name, this option is to be used for development only at this
   point and generates a warning message when used.  Unfortunately,
   cgroup interface currently has too many breakages and inconsistencies
   to implement a consistent and unified hierarchy on top.  The new flag
   is used to collect the behavior changes which are necessary to
   implement consistent unified hierarchy.  It's likely that this flag
   won't be used verbatim when it becomes ready but will be enabled
   implicitly along with unified hierarchy.

   The option currently disables some of the broken behaviors in cgroup core
   and also .use_hierarchy switch in memcg (will be routed through -mm),
   which can be used to make very unusual hierarchy where nesting is
   partially honored.  It will also be used to implement hierarchy
   support for blk-throttle which would be impossible otherwise without
   introducing a full separate set of control knobs.

   This is essentially versioning of interface which isn't very nice but
   at this point I can't see any other options which would allow keeping
   the interface the same while moving towards hierarchy behavior which
   is at least somewhat sane.  The planned unified hierarchy is likely
   to require some level of adaptation from userland anyway, so I think
   it'd be best to take the chance and update the interface such that
   it's supportable in the long term.

   Maintaining the existing interface does complicate cgroup core but
   shouldn't put too much strain on individual controllers and I think
   it'd be manageable for the foreseeable future.  Maybe we'll be able
   to drop it in a decade.

Fix up conflicts (including a semantic one adding a new #include to ppc
that was uncovered by the header file changes) as per Tejun.

* 'for-3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (45 commits)
  cpuset: fix compile warning when CONFIG_SMP=n
  cpuset: fix cpu hotplug vs rebuild_sched_domains() race
  cpuset: use rebuild_sched_domains() in cpuset_hotplug_workfn()
  cgroup: restore the call to eventfd->poll()
  cgroup: fix use-after-free when umounting cgroupfs
  cgroup: fix broken file xattrs
  devcg: remove parent_cgroup.
  memcg: force use_hierarchy if sane_behavior
  cgroup: remove cgrp->top_cgroup
  cgroup: introduce sane_behavior mount option
  move cgroupfs_root to include/linux/cgroup.h
  cgroup: convert cgroupfs_root flag bits to masks and add CGRP_ prefix
  cgroup: make cgroup_path() not print double slashes
  Revert "cgroup: remove bind() method from cgroup_subsys."
  perf: make perf_event cgroup hierarchical
  cgroup: implement cgroup_is_descendant()
  cgroup: make sure parent won't be destroyed before its children
  cgroup: remove bind() method from cgroup_subsys.
  devcg: remove broken_hierarchy tag
  cgroup: remove cgroup_lock_is_held()
  ...

12 years agokvm: KVM_CAP_IOMMU only available with device assignment
Alex Williamson [Mon, 29 Apr 2013 16:54:08 +0000 (10:54 -0600)]
kvm: KVM_CAP_IOMMU only available with device assignment

Fix build with CONFIG_PCI unset by linking KVM_CAP_IOMMU to
device assignment config option.  It has no purpose otherwise.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
12 years agoMerge branch 'for-3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq
Linus Torvalds [Tue, 30 Apr 2013 02:07:40 +0000 (19:07 -0700)]
Merge branch 'for-3.10' of git://git./linux/kernel/git/tj/wq

Pull workqueue updates from Tejun Heo:
 "A lot of activity on the workqueue side this time.  The changes achieve
  the following.

   - WQ_UNBOUND workqueues - the workqueues which are not per-cpu - are
     updated to be able to interface with multiple backend worker pools.
     This involved a lot of churning but the end result seems actually
     neater as unbound workqueues are now a lot closer to per-cpu ones.

   - The ability to interface with multiple backend worker pools is
     used to implement unbound workqueues with custom attributes.
     Currently the supported attributes are the nice level and CPU
     affinity.  It may be expanded to include cgroup association in
     future.  The attributes can be specified either by calling
     apply_workqueue_attrs() or through /sys/bus/workqueue/WQ_NAME/* if
     the workqueue in question is exported through sysfs.

     The backend worker pools are keyed by the actual attributes and
     shared by any workqueues which share the same attributes.  When
     attributes of a workqueue are changed, the workqueue binds to the
     worker pool with the specified attributes while leaving the work
     items which are already executing in its previous worker pools
     alone.

     This allows converting custom worker pool implementations which
     want worker attribute tuning to use workqueues.  The writeback pool
     is already converted in block tree and a couple of others are
     likely to follow including btrfs io workers.

   - WQ_UNBOUND's ability to bind to multiple worker pools is also used
     to make it NUMA-aware.  Because there's no association between work
     item issuer and the specific worker assigned to execute it, before
     this change, using unbound workqueue led to unnecessary cross-node
     bouncing and it couldn't be helped by autonuma as it requires tasks
     to have implicit node affinity and workers are assigned randomly.

     After these changes, an unbound workqueue now binds to multiple
     NUMA-affine worker pools so that queued work items are executed in
     the same node.  This is turned on by default but can be disabled
     system-wide or for individual workqueues.

     Crypto was requesting NUMA affinity as encrypting data across
     different nodes can contribute noticeable overhead and doing it
     per-cpu was too limiting for certain cases and IO throughput could
     be bottlenecked by one CPU being fully occupied while others have
     idle cycles.

  While the new features required a lot of changes including
  restructuring locking, it didn't complicate the execution paths much.
  The unbound workqueue handling is now closer to per-cpu ones and the
  new features are implemented by simply associating a workqueue with
  different sets of backend worker pools without changing queue,
  execution or flush paths.

  As such, even though the amount of change is very high, I feel
  relatively safe in that it isn't likely to cause subtle issues with
  basic correctness of work item execution and handling.  If something
  is wrong, it's likely to show up as being associated with worker pools
  with the wrong attributes or OOPS while workqueue attributes are being
  changed or during CPU hotplug.

  While this creates more backend worker pools, it doesn't add too many
  more workers unless, of course, there are many workqueues with unique
  combinations of attributes.  Assuming everything else is the same,
  NUMA awareness costs an extra worker pool per NUMA node with online
  CPUs.

  There are also a couple things which are being routed outside the
  workqueue tree.

   - block tree pulled in workqueue for-3.10 so that writeback worker
     pool can be converted to unbound workqueue with sysfs control
     exposed.  This simplifies the code, makes writeback workers
     NUMA-aware and allows tuning nice level and CPU affinity via sysfs.

   - The conversion to workqueue means that there's no 1:1 association
     with a specific worker, which makes writeback folks unhappy as
     they want to be able to tell which filesystem caused a problem from
     backtrace on systems with many filesystems mounted.  This is
     resolved by allowing work items to set debug info string which is
     printed when the task is dumped.  As this change involves unifying
     implementations of dump_stack() and friends in arch codes, it's
     being routed through Andrew's -mm tree."

* 'for-3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: (84 commits)
  workqueue: use kmem_cache_free() instead of kfree()
  workqueue: avoid false negative WARN_ON() in destroy_workqueue()
  workqueue: update sysfs interface to reflect NUMA awareness and a kernel param to disable NUMA affinity
  workqueue: implement NUMA affinity for unbound workqueues
  workqueue: introduce put_pwq_unlocked()
  workqueue: introduce numa_pwq_tbl_install()
  workqueue: use NUMA-aware allocation for pool_workqueues
  workqueue: break init_and_link_pwq() into two functions and introduce alloc_unbound_pwq()
  workqueue: map an unbound workqueues to multiple per-node pool_workqueues
  workqueue: move hot fields of workqueue_struct to the end
  workqueue: make workqueue->name[] fixed len
  workqueue: add workqueue->unbound_attrs
  workqueue: determine NUMA node of workers according to the allowed cpumask
  workqueue: drop 'H' from kworker names of unbound worker pools
  workqueue: add wq_numa_tbl_len and wq_numa_possible_cpumask[]
  workqueue: move pwq_pool_locking outside of get/put_unbound_pool()
  workqueue: fix memory leak in apply_workqueue_attrs()
  workqueue: fix unbound workqueue attrs hashing / comparison
  workqueue: fix race condition in unbound workqueue free path
  workqueue: remove pwq_lock which is no longer used
  ...

12 years agoMerge branch 'for-3.10-async' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq
Linus Torvalds [Tue, 30 Apr 2013 02:06:59 +0000 (19:06 -0700)]
Merge branch 'for-3.10-async' of git://git./linux/kernel/git/tj/wq

Pull async update from Tejun Heo:
 "This contains three cleanup patches for async from Lai.  All three
  patches are essentially cosmetic."

* 'for-3.10-async' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
  async: rename and redefine async_func_ptr
  async: remove unused @node from struct async_domain
  async: simplify lowest_in_progress()

12 years agoMerge branch 'for-3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu
Linus Torvalds [Tue, 30 Apr 2013 02:06:16 +0000 (19:06 -0700)]
Merge branch 'for-3.10' of git://git./linux/kernel/git/tj/percpu

Pull percpu patch from Tejun Heo:
 "A puny pull request for percpu.  We were expecting more cleanup
  patches but didn't happen this time, so just a single patch adding
  documentation from Christoph."

* 'for-3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
  percpu: add documentation on this_cpu operations

12 years agonet: rename random32 to prandom
Akinobu Mita [Mon, 29 Apr 2013 23:21:42 +0000 (16:21 -0700)]
net: rename random32 to prandom

Commit 496f2f93b1cc ("random32: rename random32 to prandom") renamed
random32() and srandom32() to prandom_u32() and prandom_seed()
respectively.

net_random() and net_srandom() need to be redefined with prandom_* in
order to finish the naming transition.

While I'm at it, enclose the macro argument of net_srandom() in parentheses.
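
Why the parentheses matter can be shown with a small standalone sketch. The
two macros below are hypothetical and are not the kernel's definitions; they
only mimic a macro that casts its argument. The example assumes an 8-bit
unsigned char.

```c
#include <assert.h>

/* Hypothetical macros for illustration only. */
#define TO_U8_UNSAFE(x)	((unsigned char)x)	/* cast binds to the first operand */
#define TO_U8_SAFE(x)	((unsigned char)(x))	/* cast covers the whole argument */

static int unsafe_result(void)
{
	/* Expands to ((unsigned char)255 + 1): cast first, then add. */
	return TO_U8_UNSAFE(255 + 1);
}

static int safe_result(void)
{
	/* Expands to ((unsigned char)(255 + 1)): add first, then truncate. */
	return TO_U8_SAFE(255 + 1);
}
```

The unparenthesized form yields 256 while the parenthesized form yields 0,
so an expression passed as the argument is only cast as a whole in the
second case.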

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agonet/core: remove duplicate statements by do-while loop
Akinobu Mita [Mon, 29 Apr 2013 23:21:41 +0000 (16:21 -0700)]
net/core: remove duplicate statements by do-while loop

Remove duplicate statements by using do-while loop instead of while loop.

- A;
- while (e) {
+ do {
A;
- }
+ } while (e);
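
As a concrete instance of the pattern above (a standalone sketch, not code
from net/core), here A is "count a digit and drop it" and e is "n != 0".
The function names are made up for the example.

```c
#include <assert.h>

/* Before: statement A appears once ahead of the loop and once inside it. */
static unsigned int digits_while(unsigned int n)
{
	unsigned int count = 0;

	count++;		/* A */
	n /= 10;
	while (n != 0) {	/* e */
		count++;	/* A again */
		n /= 10;
	}
	return count;
}

/* After: the do-while form runs A at least once without duplicating it. */
static unsigned int digits_do_while(unsigned int n)
{
	unsigned int count = 0;

	do {
		count++;	/* A */
		n /= 10;
	} while (n != 0);	/* e */
	return count;
}

/* Self-check helper: both forms must agree for a given input. */
static void check_same(unsigned int n)
{
	assert(digits_while(n) == digits_do_while(n));
}
```

Both versions count the decimal digits of n and treat 0 as having one digit,
which is the case where the body has to run before the condition is tested.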

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agonet/core: rename random32() to prandom_u32()
Akinobu Mita [Mon, 29 Apr 2013 23:21:40 +0000 (16:21 -0700)]
net/core: rename random32() to prandom_u32()

Use preferable function name which implies using a pseudo-random
number generator.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agonet/netfilter: rename random32() to prandom_u32()
Akinobu Mita [Mon, 29 Apr 2013 23:21:39 +0000 (16:21 -0700)]
net/netfilter: rename random32() to prandom_u32()

Use preferable function name which implies using a pseudo-random
number generator.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Patrick McHardy <kaber@trash.net>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agonet/sched: rename random32() to prandom_u32()
Akinobu Mita [Mon, 29 Apr 2013 23:21:38 +0000 (16:21 -0700)]
net/sched: rename random32() to prandom_u32()

Use preferable function name which implies using a pseudo-random
number generator.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agonet/sunrpc: rename random32() to prandom_u32()
Akinobu Mita [Mon, 29 Apr 2013 23:21:37 +0000 (16:21 -0700)]
net/sunrpc: rename random32() to prandom_u32()

Use preferable function name which implies using a pseudo-random
number generator.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agoscsi: rename random32() to prandom_u32()
Akinobu Mita [Mon, 29 Apr 2013 23:21:35 +0000 (16:21 -0700)]
scsi: rename random32() to prandom_u32()

Use preferable function name which implies using a pseudo-random
number generator.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
Cc: Robert Love <robert.w.love@intel.com>
Cc: James Smart <james.smart@emulex.com>
Cc: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agolguest: rename random32() to prandom_u32()
Akinobu Mita [Mon, 29 Apr 2013 23:21:34 +0000 (16:21 -0700)]
lguest: rename random32() to prandom_u32()

Use preferable function name which implies using a pseudo-random
number generator.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agouwb: rename random32() to prandom_u32()
Akinobu Mita [Mon, 29 Apr 2013 23:21:34 +0000 (16:21 -0700)]
uwb: rename random32() to prandom_u32()

Use preferable function name which implies using a pseudo-random
number generator.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agovideo/uvesafb: rename random32() to prandom_u32()
Akinobu Mita [Mon, 29 Apr 2013 23:21:32 +0000 (16:21 -0700)]
video/uvesafb: rename random32() to prandom_u32()

Use preferable function name which implies using a pseudo-random
number generator.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Michal Januszewski <spock@gentoo.org>
Cc: Florian Tobias Schandinat <FlorianSchandinat@gmx.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agommc: rename random32() to prandom_u32()
Akinobu Mita [Mon, 29 Apr 2013 23:21:31 +0000 (16:21 -0700)]
mmc: rename random32() to prandom_u32()

Use preferable function name which implies using a pseudo-random
number generator.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Chris Ball <cjb@laptop.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agodrbd: rename random32() to prandom_u32()
Akinobu Mita [Mon, 29 Apr 2013 23:21:31 +0000 (16:21 -0700)]
drbd: rename random32() to prandom_u32()

Use preferable function name which implies using a pseudo-random
number generator.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agokernel/: rename random32() to prandom_u32()
Akinobu Mita [Mon, 29 Apr 2013 23:21:30 +0000 (16:21 -0700)]
kernel/: rename random32() to prandom_u32()

Use preferable function name which implies using a pseudo-random
number generator.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agomm/: rename random32() to prandom_u32()
Akinobu Mita [Mon, 29 Apr 2013 23:21:29 +0000 (16:21 -0700)]
mm/: rename random32() to prandom_u32()

Use preferable function name which implies using a pseudo-random
number generator.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agolib/: rename random32() to prandom_u32()
Akinobu Mita [Mon, 29 Apr 2013 23:21:28 +0000 (16:21 -0700)]
lib/: rename random32() to prandom_u32()

Use preferable function name which implies using a pseudo-random
number generator.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agox86: rename random32() to prandom_u32()
Akinobu Mita [Mon, 29 Apr 2013 23:21:27 +0000 (16:21 -0700)]
x86: rename random32() to prandom_u32()

Use preferable function name which implies using a pseudo-random
number generator.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agox86: pageattr-test: remove srandom32 call
Akinobu Mita [Mon, 29 Apr 2013 23:21:26 +0000 (16:21 -0700)]
x86: pageattr-test: remove srandom32 call

pageattr-test calls srandom32() once every test iteration.  But calling
srandom32() after late_initcalls is not meaningful, because the random
state for random32() is mixed with good random numbers in the late_initcall
prandom_reseed().

So this removes the call to srandom32().

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agouuid: use prandom_bytes()
Akinobu Mita [Mon, 29 Apr 2013 23:21:25 +0000 (16:21 -0700)]
uuid: use prandom_bytes()

Use prandom_bytes() to generate 16 pseudo-random bytes.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Huang Ying <ying.huang@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agoraid6test: use prandom_bytes()
Akinobu Mita [Mon, 29 Apr 2013 23:21:24 +0000 (16:21 -0700)]
raid6test: use prandom_bytes()

Use prandom_bytes() to generate random bytes for test data.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Dan Williams <djbw@fb.com>
Cc: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agosctp: convert sctp_assoc_set_id() to use idr_alloc_cyclic()
Jeff Layton [Mon, 29 Apr 2013 23:21:22 +0000 (16:21 -0700)]
sctp: convert sctp_assoc_set_id() to use idr_alloc_cyclic()

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Cc: Sridhar Samudrala <sri@us.ibm.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agoinotify: convert inotify_add_to_idr() to use idr_alloc_cyclic()
Jeff Layton [Mon, 29 Apr 2013 23:21:21 +0000 (16:21 -0700)]
inotify: convert inotify_add_to_idr() to use idr_alloc_cyclic()

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Cc: John McCutchan <john@johnmccutchan.com>
Cc: Robert Love <rlove@rlove.org>
Cc: Eric Paris <eparis@parisplace.org>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agonfsd: convert nfs4_alloc_stid() to use idr_alloc_cyclic()
Jeff Layton [Mon, 29 Apr 2013 23:21:20 +0000 (16:21 -0700)]
nfsd: convert nfs4_alloc_stid() to use idr_alloc_cyclic()

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Acked-by: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agodrivers/infiniband/hw/mlx4: convert to using idr_alloc_cyclic()
Jeff Layton [Mon, 29 Apr 2013 23:21:19 +0000 (16:21 -0700)]
drivers/infiniband/hw/mlx4: convert to using idr_alloc_cyclic()

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Jack Morgenstein <jackm@dev.mellanox.co.il>
Cc: Or Gerlitz <ogerlitz@mellanox.com>
Cc: Roland Dreier <roland@purestorage.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agodrivers/infiniband/hw/amso1100: convert to using idr_alloc_cyclic
Jeff Layton [Mon, 29 Apr 2013 23:21:18 +0000 (16:21 -0700)]
drivers/infiniband/hw/amso1100: convert to using idr_alloc_cyclic

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Tejun Heo <tj@kernel.org>
Cc: Steve Wise <swise@opengridcomputing.com>
Cc: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agoidr: introduce idr_alloc_cyclic()
Jeff Layton [Mon, 29 Apr 2013 23:21:16 +0000 (16:21 -0700)]
idr: introduce idr_alloc_cyclic()

As Tejun points out, there are several users of the IDR facility that
attempt to use it in a cyclic fashion.  However, these users are likely
to see -ENOSPC errors after the counter wraps one or more times.

This patchset adds a new idr_alloc_cyclic routine and converts several
of these users to it.  Many of these users are in obscure parts of the
kernel, and I don't have a good way to test some of them.  The change is
pretty straightforward though, so hopefully it won't be an issue.

There is one other cyclic user of idr_alloc that I didn't touch in
ipc/util.c.  That one is doing some strange stuff that I didn't quite
understand, but it looks like it should probably be converted later
somehow.

This patch:

Thus spake Tejun Heo:

    Ooh, BTW, the cyclic allocation is broken.  It's prone to -ENOSPC
    after the first wraparound.  There are several cyclic users in the
    kernel and I think it probably would be best to implement cyclic
    support in idr.

This patch does that by adding a new idr_alloc_cyclic() function that such
users in the kernel can use.  With this, there's no need for a caller to
keep track of the last value used as that's now tracked internally.  This
should prevent the ENOSPC problems that can hit when the "last allocated"
counter exceeds INT_MAX.

Later patches will convert existing cyclic users to the new interface.
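
As a rough illustration of the cyclic behaviour described above, here is
a minimal userspace sketch.  It is not the kernel implementation: the
toy_idr structure, the toy_* function names and the fixed TOY_MAX_IDS
capacity are all assumptions made for the example.  It only shows the
idea of searching upward from an internally tracked cursor and wrapping
back to the start of the range once before giving up with -ENOSPC.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define TOY_MAX_IDS 64	/* hypothetical capacity of the toy ID space */

/* Toy stand-in for an ID allocator: a used-map plus the cyclic cursor. */
struct toy_idr {
	bool used[TOY_MAX_IDS];
	int cur;	/* next ID to try; tracked internally, not by callers */
};

/* Return the lowest free ID in [start, end), or -ENOSPC if none is free. */
static int toy_alloc(struct toy_idr *idr, int start, int end)
{
	assert(start >= 0 && end <= TOY_MAX_IDS);
	for (int id = start; id < end; id++) {
		if (!idr->used[id]) {
			idr->used[id] = true;
			return id;
		}
	}
	return -ENOSPC;
}

/* Cyclic allocation: search upward from the cursor, then wrap to start. */
static int toy_alloc_cyclic(struct toy_idr *idr, int start, int end)
{
	int from = idr->cur > start ? idr->cur : start;
	int id = toy_alloc(idr, from, end);

	if (id == -ENOSPC)
		id = toy_alloc(idr, start, end);	/* wrap around once */
	if (id >= 0)
		idr->cur = id + 1;	/* remember where to resume */
	return id;
}

/* Release an ID so that a later wraparound can hand it out again. */
static void toy_remove(struct toy_idr *idr, int id)
{
	assert(id >= 0 && id < TOY_MAX_IDS);
	idr->used[id] = false;
}
```

With a zero-initialised toy_idr and the range [0, 4), successive calls
hand out 0 and 1; after toy_remove() of 0 the next calls still return 2
and 3, and only then does the allocator wrap and reuse 0.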

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Tejun Heo <tj@kernel.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Eric Paris <eparis@parisplace.org>
Cc: Jack Morgenstein <jackm@dev.mellanox.co.il>
Cc: John McCutchan <john@johnmccutchan.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: Or Gerlitz <ogerlitz@mellanox.com>
Cc: Robert Love <rlove@rlove.org>
Cc: Roland Dreier <roland@purestorage.com>
Cc: Sridhar Samudrala <sri@us.ibm.com>
Cc: Steve Wise <swise@opengridcomputing.com>
Cc: Tom Tucker <tom@opengridcomputing.com>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agoDocumentation: update nfs option in filesystem/vfat.txt
Namjae Jeon [Mon, 29 Apr 2013 23:21:15 +0000 (16:21 -0700)]
Documentation: update nfs option in filesystem/vfat.txt

Add descriptions about 'stale_rw' and 'nostale_ro' nfs options in
filesystem/vfat.txt

Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Ravishankar N <ravi.n1@samsung.com>
Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com>
Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Acked-by: Rob Landley <rob@landley.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agofat (exportfs): rebuild directory-inode if fat_dget()
Namjae Jeon [Mon, 29 Apr 2013 23:21:14 +0000 (16:21 -0700)]
fat (exportfs): rebuild directory-inode if fat_dget()

This patch enables rebuilding of directory inodes which are not present in
the cache.  This is done by traversing the disk clusters to find the
directory entry of the parent directory and using its i_pos to build the
inode.

The traversal is done by fat_scan_logstart() which is similar to
fat_scan() but matches i_pos values instead of names.  fat_scan_logstart()
needs an inode parameter to work, for which a dummy inode is created by
its caller fat_rebuild_parent().  This dummy inode is destroyed after the
traversal completes.

All this is done only if the nostale_ro nfs mount option is specified.

Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Ravishankar N <ravi.n1@samsung.com>
Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com>
Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agofat (exportfs): rebuild inode if ilookup() fails
Namjae Jeon [Mon, 29 Apr 2013 23:21:12 +0000 (16:21 -0700)]
fat (exportfs): rebuild inode if ilookup() fails

If the cache lookups fail, use the i_pos value to find the directory entry
of the inode and rebuild the inode.  Since this involves accessing the FAT
media, do this only if the nostale_ro nfs mount option is specified.

Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Ravishankar N <ravi.n1@samsung.com>
Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com>
Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agofat: restructure export_operations
Namjae Jeon [Mon, 29 Apr 2013 23:21:11 +0000 (16:21 -0700)]
fat: restructure export_operations

Define two nfs export_operation structures, one for 'stale_rw' mounts and
the other for 'nostale_ro'.  The latter uses i_pos as a basis for encoding
and decoding file handles.

Also, assign i_pos to kstat->ino.  The logic for rebuilding the inode is
added in the subsequent patches.

Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Ravishankar N <ravi.n1@samsung.com>
Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com>
Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agofat: introduce a helper fat_get_blknr_offset()
Namjae Jeon [Mon, 29 Apr 2013 23:21:10 +0000 (16:21 -0700)]
fat: introduce a helper fat_get_blknr_offset()

Introduce a helper function to get the block number and offset for a given
i_pos value.  Use it in __fat_write_inode() now and later on in nfs.c.
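
A minimal userspace sketch of such a split follows.  It assumes that
i_pos packs the block number in its upper bits and the index of the
directory entry within that block in its lower bits, and that the number
of entries per block is a power of two.  The toy_sb_info structure and
the toy_get_blknr_offset() name are made up for the example and are not
the kernel's definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the superblock fields the helper needs. */
struct toy_sb_info {
	unsigned int dir_per_block;		/* entries per block, power of two */
	unsigned int dir_per_block_bits;	/* log2(dir_per_block) */
};

/*
 * Split an i_pos value into the block that holds the directory entry
 * and the index of the entry inside that block.
 */
static void toy_get_blknr_offset(const struct toy_sb_info *sbi, int64_t i_pos,
				 uint64_t *blknr, int *offset)
{
	assert(sbi->dir_per_block == 1u << sbi->dir_per_block_bits);
	*blknr = (uint64_t)i_pos >> sbi->dir_per_block_bits;
	*offset = (int)(i_pos & (sbi->dir_per_block - 1));
}
```

For example, with 16 entries per block (dir_per_block_bits = 4), an
i_pos of 0x123 refers to entry 3 of block 0x12.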

Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Ravishankar N <ravi.n1@samsung.com>
Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com>
Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agofat: move fat_i_pos_read to fat.h
Namjae Jeon [Mon, 29 Apr 2013 23:21:09 +0000 (16:21 -0700)]
fat: move fat_i_pos_read to fat.h

Move fat_i_pos_read to fat.h so that it can be called from nfs.c in the
subsequent patches to encode the file handle.

Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Ravishankar N <ravi.n1@samsung.com>
Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com>
Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agofat: introduce 2 new values for the -o nfs mount option
Namjae Jeon [Mon, 29 Apr 2013 23:21:08 +0000 (16:21 -0700)]
fat: introduce 2 new values for the -o nfs mount option

This patchset eliminates the client side ESTALE errors when a FAT
partition exported over NFS has its dentries evicted from the cache.  The
idea is to find the on-disk location ('i_pos') of the dirent of the inode
that has been evicted and use it to rebuild the inode.

This patch:

Provide two possible values 'stale_rw' and 'nostale_ro' for the -o nfs
mount option.  The first one allows all file operations but does not reduce
ESTALE errors on memory constrained systems.  The second one eliminates
ESTALE errors but mounts the filesystem as read-only.  Not specifying a
value defaults to 'stale_rw'.
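
The option semantics described above can be sketched in userspace C as
follows.  This is only an illustration: the toy_nfs_mode enum and both
function names are hypothetical and do not come from the patch, and the
real option parsing in the kernel is structured differently.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names for the two modes described above. */
enum toy_nfs_mode {
	TOY_NFS_STALE_RW,	/* all file operations allowed; ESTALE possible */
	TOY_NFS_NOSTALE_RO,	/* no ESTALE, but the filesystem is read-only */
	TOY_NFS_INVALID,	/* unrecognised value */
};

/* Map the value given to the nfs option to a mode; NULL or "" means none. */
static enum toy_nfs_mode toy_parse_nfs_value(const char *value)
{
	if (value == NULL || value[0] == '\0')
		return TOY_NFS_STALE_RW;	/* default when no value is given */
	if (strcmp(value, "stale_rw") == 0)
		return TOY_NFS_STALE_RW;
	if (strcmp(value, "nostale_ro") == 0)
		return TOY_NFS_NOSTALE_RO;
	return TOY_NFS_INVALID;
}

/* Only nostale_ro forces a read-only mount. */
static int toy_forces_readonly(enum toy_nfs_mode mode)
{
	assert(mode != TOY_NFS_INVALID);
	return mode == TOY_NFS_NOSTALE_RO;
}
```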

Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Ravishankar N <ravi.n1@samsung.com>
Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com>
Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agodrivers/rtc/rtc-pcf2123.c: fix error return code in pcf2123_probe()
Wei Yongjun [Mon, 29 Apr 2013 23:21:07 +0000 (16:21 -0700)]
drivers/rtc/rtc-pcf2123.c: fix error return code in pcf2123_probe()

Return -ENODEV instead of 0 in the chip-not-found error handling case,
as done elsewhere in this function.
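
The general shape of this kind of bug can be sketched in userspace C as
below.  This is not the driver code: toy_probe() and its parameters are
hypothetical, and the cleanup is reduced to setting a flag.  The point is
that an error path reached through a goto returns whatever ret holds, so
ret has to be set to a negative errno before the jump.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical probe: chip_present models whether the chip answered. */
static int toy_probe(bool chip_present, bool *cleaned_up)
{
	int ret = 0;

	assert(cleaned_up != NULL);
	*cleaned_up = false;

	if (!chip_present) {
		ret = -ENODEV;	/* without this, the error path returns 0 */
		goto fail;
	}
	return 0;

fail:
	*cleaned_up = true;	/* stand-in for releasing resources */
	return ret;
}
```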

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Cc: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agodrivers/rtc/rtc-isl12022.c: Remove rtc8564 from isl12022_id
Axel Lin [Mon, 29 Apr 2013 23:21:06 +0000 (16:21 -0700)]
drivers/rtc/rtc-isl12022.c: Remove rtc8564 from isl12022_id

rtc8564 appears in i2c_device_id table of both rtc-isl12022.c and
rtc-pcf8563.c.  Commit 8ea9212cbd65 "rtc-pcf8563: add chip id" added the
rtc8564 chip entry to pcf8563.  isl12022 driver is modified from pcf8563
driver, so this looks like a copy-paste bug.

Signed-off-by: Axel Lin <axel.lin@ingics.com>
Cc: Roman Fietze <roman.fietze@telemotive.de>
Cc: Jon Smirl <jonsmirl@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agodrivers/rtc/rtc-at91rm9200.c: fix missing iounmap
Johan Hovold [Mon, 29 Apr 2013 23:21:05 +0000 (16:21 -0700)]
drivers/rtc/rtc-at91rm9200.c: fix missing iounmap

Add the missing iounmap() to the probe error path and to remove().

Signed-off-by: Johan Hovold <jhovold@gmail.com>
Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agortc: rtc-twl: convert twl4030rtc_driver to dev_pm_ops
Jingoo Han [Mon, 29 Apr 2013 23:21:04 +0000 (16:21 -0700)]
rtc: rtc-twl: convert twl4030rtc_driver to dev_pm_ops

Instead of using the legacy suspend/resume methods, use the newer
dev_pm_ops structure, which allows better control over power management.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agortc: rtc-stmp3xxx: convert stmp3xxx_rtcdrv to dev_pm_ops
Jingoo Han [Mon, 29 Apr 2013 23:21:03 +0000 (16:21 -0700)]
rtc: rtc-stmp3xxx: convert stmp3xxx_rtcdrv to dev_pm_ops

Instead of using the legacy suspend/resume methods, use the newer
dev_pm_ops structure, which allows better control over power management.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agortc: rtc-spear: convert spear_rtc_driver to dev_pm_ops
Jingoo Han [Mon, 29 Apr 2013 23:21:02 +0000 (16:21 -0700)]
rtc: rtc-spear: convert spear_rtc_driver to dev_pm_ops

Instead of using the legacy suspend/resume methods, use the newer
dev_pm_ops structure, which allows better control over power management.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agortc: rtc-puv3: convert puv3_rtc_driver to dev_pm_ops
Jingoo Han [Mon, 29 Apr 2013 23:21:02 +0000 (16:21 -0700)]
rtc: rtc-puv3: convert puv3_rtc_driver to dev_pm_ops

Instead of using the legacy suspend/resume methods, use the newer
dev_pm_ops structure, which allows better control over power management.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>