commit e3f681ae691bd64d1dc06bab9ad4c98ad4effa82 Author: Greg Kroah-Hartman Date: Sat Oct 13 05:44:59 2012 +0900 Linux 3.4.14 commit 93875bc33f71bf7a4851b44d387743866b13b130 Author: Mike Galbraith Date: Sat Aug 4 05:44:14 2012 +0200 sched: Fix migration thread runtime bogosity commit 8f6189684eb4e85e6c593cd710693f09c944450a upstream. Make stop scheduler class do the same accounting as other classes, Migration threads can be caught in the act while doing exec balancing, leading to the below due to use of unmaintained ->se.exec_start. The load that triggered this particular instance was an apparently out of control heavily threaded application that does system monitoring in what equated to an exec bomb, with one of the VERY frequently migrated tasks being ps. %CPU PID USER CMD 99.3 45 root [migration/10] 97.7 53 root [migration/12] 97.0 57 root [migration/13] 90.1 49 root [migration/11] 89.6 65 root [migration/15] 88.7 17 root [migration/3] 80.4 37 root [migration/8] 78.1 41 root [migration/9] 44.2 13 root [migration/2] Signed-off-by: Mike Galbraith Signed-off-by: Peter Zijlstra Link: http://lkml.kernel.org/r/1344051854.6739.19.camel@marge.simpson.net Signed-off-by: Thomas Gleixner Cc: Steven Rostedt Signed-off-by: Greg Kroah-Hartman commit e08ea4c7d4e90aab75307c3c8018d2d4bbeebc0b Author: Nikola Pajkovsky Date: Wed Aug 15 00:38:08 2012 +0200 udf: fix retun value on error path in udf_load_logicalvol commit 68766a2edcd5cd744262a70a2f67a320ac944760 upstream. In case we detect a problem and bail out, we fail to set "ret" to a nonzero value, and udf_load_logicalvol will mistakenly report success. Signed-off-by: Nikola Pajkovsky Signed-off-by: Jan Kara Cc: Herton Ronaldo Krzesinski Signed-off-by: Greg Kroah-Hartman commit a71d78897ad4b0df7239269f5772bd059a0a2541 Author: Frediano Ziglio Date: Tue Aug 7 04:33:03 2012 -0500 Convert properly UTF-8 to UTF-16 commit fd3ba42c76d3d4b776120c2b24c1791e7bb3deb1 upstream. wchar_t is currently 16bit so converting a utf8 encoded characters not in plane 0 (>= 0x10000) to wchar_t (that is calling char2uni) lead to a -EINVAL return. This patch detect utf8 in cifs_strtoUTF16 and add special code calling utf8s_to_utf16s. Signed-off-by: Frediano Ziglio Acked-by: Jeff Layton Signed-off-by: Steve French Signed-off-by: Greg Kroah-Hartman commit 84b7167f1ab14715edb851b17c011d645dbdd312 Author: Jeff Layton Date: Wed Oct 3 16:02:36 2012 -0400 cifs: reinstate the forcegid option commit 72bd481f860f0125c810bb43d878ce5f9c060c58 upstream. Apparently this was lost when we converted to the standard option parser in 8830d7e07a5e38bc47650a7554b7c1cfd49902bf Reported-by: Gregory Lee Bartholomew Cc: Sachin Prabhu Signed-off-by: Jeff Layton Signed-off-by: Steve French Signed-off-by: Greg Kroah-Hartman commit b9190164feb7046a23029eb00d0bb67f500d03b7 Author: Brian Norris Date: Fri Aug 31 15:01:19 2012 -0700 JFFS2: don't fail on bitflips in OOB commit 74d83beaa229aac7d126ac1ed9414658ff1a89d2 upstream. JFFS2 was designed without thought for OOB bitflips, it seems, but they can occur and will be reported to JFFS2 via mtd_read_oob()[1]. We don't want to fail on these transactions, since the data was corrected. [1] Few drivers report bitflips for OOB-only transactions. With such drivers, this patch should have no effect. Signed-off-by: Brian Norris Signed-off-by: Artem Bityutskiy Signed-off-by: David Woodhouse Signed-off-by: Greg Kroah-Hartman commit 719740d5deffe22dfdb2bbb06717c1763ebf178e Author: Guennadi Liakhovetski Date: Tue Sep 18 06:42:42 2012 +0000 mmc: sh-mmcif: avoid oops on spurious interrupts commit 8464dd52d3198dd05cafb005371d76e5339eb842 upstream. On some systems, e.g., kzm9g, MMCIF interfaces can produce spurious interrupts without any active request. To prevent the Oops, that results in such cases, don't dereference the mmc request pointer until we make sure, that we are indeed processing such a request. Reported-by: Tetsuyuki Kobayashi Signed-off-by: Guennadi Liakhovetski Signed-off-by: Chris Ball Signed-off-by: Greg Kroah-Hartman commit 3883e2826411675a47cab6e8e8f49288c4452e34 Author: Vaibhav Bedia Date: Thu Sep 13 06:31:03 2012 +0000 mmc: omap_hsmmc: Pass on the suspend failure to the PM core commit c4c8eeb4df00aabb641553d6fbcd46f458e56cd9 upstream. In some cases mmc_suspend_host() is not able to claim the host and proceed with the suspend process. The core returns -EBUSY to the host controller driver. Unfortunately, the host controller driver does not pass on this information to the PM core and hence the system suspend process continues. ret = mmc_suspend_host(host->mmc); if (ret) { host->suspended = 0; if (host->pdata->resume) { ret = host->pdata->resume(dev, host->slot_id); The return status from mmc_suspend_host() is overwritten by return status from host->pdata->resume. So the original return status is lost. In these cases the MMC core gets to an unexpected state during resume and multiple issues related to MMC crop up. 1. Host controller driver starts accessing the device registers before the clocks are enabled which leads to a prefetch abort. 2. A file copy thread which was launched before suspend gets stuck due to the host not being reclaimed during resume. To avoid such problems pass on the -EBUSY status to the PM core from the host controller driver. With this change, MMC core suspend might still fail but it does not end up making the system unusable. Suspend gets aborted and the user can try suspending the system again. Signed-off-by: Vaibhav Bedia Signed-off-by: Hebbar, Gururaja Acked-by: Venkatraman S Signed-off-by: Chris Ball Signed-off-by: Greg Kroah-Hartman commit dbb28ef5cf5318b0abb95712f1cfafaa032796e5 Author: Andreas Bießmann Date: Fri Aug 31 13:35:42 2012 +0200 mtd: omap2: fix module loading commit 4d3d688da8e7016f15483e9319b41311e1db9515 upstream. Unloading the omap2 nand driver missed to release the memory region which will result in not being able to request it again if one want to load the driver later on. This patch fixes following error when loading omap2 module after unloading: ---8<--- ~ $ rmmod omap2 ~ $ modprobe omap2 [ 37.420928] omap2-nand: probe of omap2-nand.0 failed with error -16 ~ $ --->8--- This error was introduced in 67ce04bf2746f8a1f8c2a104b313d20c63f68378 which was the first commit of this driver. Signed-off-by: Andreas Bießmann Signed-off-by: Artem Bityutskiy Signed-off-by: David Woodhouse Signed-off-by: Greg Kroah-Hartman commit 7c6313717892eabf93b53d8a112263f41646ed4c Author: Andreas Bießmann Date: Fri Aug 31 13:35:41 2012 +0200 mtd: omap2: fix omap_nand_remove segfault commit 7d9b110269253b1d5858cfa57d68dfc7bf50dd77 upstream. Do not kfree() the mtd_info; it is handled in the mtd subsystem and already freed by nand_release(). Instead kfree() the struct omap_nand_info allocated in omap_nand_probe which was not freed before. This patch fixes following error when unloading the omap2 module: ---8<--- ~ $ rmmod omap2 ------------[ cut here ]------------ kernel BUG at mm/slab.c:3126! Internal error: Oops - BUG: 0 [#1] PREEMPT ARM Modules linked in: omap2(-) CPU: 0 Not tainted (3.6.0-rc3-00230-g155e36d-dirty #3) PC is at cache_free_debugcheck+0x2d4/0x36c LR is at kfree+0xc8/0x2ac pc : [] lr : [] psr: 200d0193 sp : c521fe08 ip : c0e8ef90 fp : c521fe5c r10: bf0001fc r9 : c521e000 r8 : c0d99c8c r7 : c661ebc0 r6 : c065d5a4 r5 : c65c4060 r4 : c78005c0 r3 : 00000000 r2 : 00001000 r1 : c65c4000 r0 : 00000001 Flags: nzCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment user Control: 10c5387d Table: 86694019 DAC: 00000015 Process rmmod (pid: 549, stack limit = 0xc521e2f0) Stack: (0xc521fe08 to 0xc5220000) fe00: c008a874 c00bf44c c515c6d0 200d0193 c65c4860 c515c240 fe20: c521fe3c c521fe30 c008a9c0 c008a854 c521fe5c c65c4860 c78005c0 bf0001fc fe40: c780ff40 a00d0113 c521e000 00000000 c521fe84 c521fe60 c0112efc c01122d8 fe60: c65c4860 c0673778 c06737ac 00000000 00070013 00000000 c521fe9c c521fe88 fe80: bf0001fc c0112e40 c0673778 bf001ca8 c521feac c521fea0 c02ca11c bf0001ac fea0: c521fec4 c521feb0 c02c82c4 c02ca100 c0673778 bf001ca8 c521fee4 c521fec8 fec0: c02c8dd8 c02c8250 00000000 bf001ca8 bf001ca8 c0804ee0 c521ff04 c521fee8 fee0: c02c804c c02c8d20 bf001924 00000000 bf001ca8 c521e000 c521ff1c c521ff08 ff00: c02c950c c02c7fbc bf001d48 00000000 c521ff2c c521ff20 c02ca3a4 c02c94b8 ff20: c521ff3c c521ff30 bf001938 c02ca394 c521ffa4 c521ff40 c009beb4 bf001930 ff40: c521ff6c 70616d6f b6fe0032 c0014f84 70616d6f b6fe0032 00000081 60070010 ff60: c521ff84 c521ff70 c008e1f4 c00bf328 0001a004 70616d6f c521ff94 0021ff88 ff80: c008e368 0001a004 70616d6f b6fe0032 00000081 c0015028 00000000 c521ffa8 ffa0: c0014dc0 c009bcd0 0001a004 70616d6f bec2ab38 00000880 bec2ab38 00000880 ffc0: 0001a004 70616d6f b6fe0032 00000081 00000319 00000000 b6fe1000 00000000 ffe0: bec2ab30 bec2ab20 00019f00 b6f539c0 60070010 bec2ab38 aaaaaaaa aaaaaaaa Backtrace: [] (cache_free_debugcheck+0x0/0x36c) from [] (kfree+0xc8/0x2ac) [] (kfree+0x0/0x2ac) from [] (omap_nand_remove+0x5c/0x64 [omap2]) [] (omap_nand_remove+0x0/0x64 [omap2]) from [] (platform_drv_remove+0x28/0x2c) r5:bf001ca8 r4:c0673778 [] (platform_drv_remove+0x0/0x2c) from [] (__device_release_driver+0x80/0xdc) [] (__device_release_driver+0x0/0xdc) from [] (driver_detach+0xc4/0xc8) r5:bf001ca8 r4:c0673778 [] (driver_detach+0x0/0xc8) from [] (bus_remove_driver+0x9c/0x104) r6:c0804ee0 r5:bf001ca8 r4:bf001ca8 r3:00000000 [] (bus_remove_driver+0x0/0x104) from [] (driver_unregister+0x60/0x80) r6:c521e000 r5:bf001ca8 r4:00000000 r3:bf001924 [] (driver_unregister+0x0/0x80) from [] (platform_driver_unregister+0x1c/0x20) r5:00000000 r4:bf001d48 [] (platform_driver_unregister+0x0/0x20) from [] (omap_nand_driver_exit+0x14/0x1c [omap2]) [] (omap_nand_driver_exit+0x0/0x1c [omap2]) from [] (sys_delete_module+0x1f0/0x2ec) [] (sys_delete_module+0x0/0x2ec) from [] (ret_fast_syscall+0x0/0x48) r8:c0015028 r7:00000081 r6:b6fe0032 r5:70616d6f r4:0001a004 Code: e1a00005 eb0d9172 e7f001f2 e7f001f2 (e7f001f2) ---[ end trace 6a30b24d8c0cc2ee ]--- Segmentation fault --->8--- This error was introduced in 67ce04bf2746f8a1f8c2a104b313d20c63f68378 which was the first commit of this driver. Signed-off-by: Andreas Bießmann Signed-off-by: Artem Bityutskiy Signed-off-by: David Woodhouse Signed-off-by: Greg Kroah-Hartman commit 196f9c71e5f46965325f31532f1d55c4a809f6ff Author: Shmulik Ladkani Date: Sun Jun 10 13:58:12 2012 +0300 mtd: nand: Use the mirror BBT descriptor when reading its version commit 7bb9c75436212813b38700c34df4bbb6eb82debe upstream. The code responsible for reading the version of the mirror bbt was incorrectly using the descriptor of the main bbt. Pass the mirror bbt descriptor to 'scan_read_raw' when reading the version of the mirror bbt. Signed-off-by: Shmulik Ladkani Acked-by: Sebastian Andrzej Siewior Signed-off-by: Artem Bityutskiy Signed-off-by: David Woodhouse Signed-off-by: Greg Kroah-Hartman commit 70b22c795bbe61ab4935649bb6b4135f29d164bd Author: Richard Genoud Date: Wed Sep 12 14:26:26 2012 +0200 mtd: nandsim: bugfix: fail if overridesize is too big commit bb0a13a13411c4ce24c48c8ff3cdf7b48d237240 upstream. If override size is too big, the module was actually loaded instead of failing, because retval was not set. This lead to memory corruption with the use of the freed structs nandsim and nand_chip. Signed-off-by: Richard Genoud Signed-off-by: Artem Bityutskiy Signed-off-by: David Woodhouse Signed-off-by: Greg Kroah-Hartman commit 2640a717145425d59c066e7de9350719912bd418 Author: Alexander Shiyan Date: Wed Aug 15 20:28:05 2012 +0400 mtd: autcpu12-nvram: Fix compile breakage commit d1f55c680e5d021e7066f4461dd678d42af18898 upstream. Update driver autcpu12-nvram.c so it compiles; map_read32/map_write32 no longer exist in the kernel so the driver is totally broken. Additionally, map_info name passed to simple_map_init is incorrect. Signed-off-by: Alexander Shiyan Acked-by: Arnd Bergmann Signed-off-by: Artem Bityutskiy Signed-off-by: David Woodhouse Signed-off-by: Greg Kroah-Hartman commit e82df24e733c04385e8d8c32d3062292668bf8c0 Author: Huang Shijie Date: Sat Aug 18 13:07:41 2012 -0400 mtd: mtdpart: break it as soon as we parse out the partitions commit c51803ddba10d80d9f246066802c6e359cf1d44c upstream. We may cause a memory leak when the @types has more then one parser. Take the `default_mtd_part_types` for example. The default_mtd_part_types has two parsers now: `cmdlinepart` and `ofpart`. Assume the following case: The kernel command line sets the partitions like: #gpmi-nand:20m(boot),20m(kernel),1g(rootfs),-(user) But the devicetree file(such as arch/arm/boot/dts/imx28-evk.dts) also sets the same partitions as the kernel command line does. In the current code, the partitions parsed out by the `ofpart` will overwrite the @pparts which has already set by the `cmdlinepart` parser, and the the partitions parsed out by the `cmdlinepart` is missed. A memory leak occurs. So we should break the code as soon as we parse out the partitions, In actually, this patch makes a priority order between the parsers. If one parser has already parsed out the partitions successfully, it's no need to use another parser anymore. Signed-off-by: Huang Shijie Signed-off-by: Artem Bityutskiy Signed-off-by: David Woodhouse Signed-off-by: Greg Kroah-Hartman commit c62f9945efea31db203fd4fb77e830ddffdcabf6 Author: Srivatsa S. Bhat Date: Thu May 24 19:46:26 2012 +0530 CPU hotplug, cpusets, suspend: Don't modify cpusets during suspend/resume commit d35be8bab9b0ce44bed4b9453f86ebf64062721e upstream. In the event of CPU hotplug, the kernel modifies the cpusets' cpus_allowed masks as and when necessary to ensure that the tasks belonging to the cpusets have some place (online CPUs) to run on. And regular CPU hotplug is destructive in the sense that the kernel doesn't remember the original cpuset configurations set by the user, across hotplug operations. However, suspend/resume (which uses CPU hotplug) is a special case in which the kernel has the responsibility to restore the system (during resume), to exactly the same state it was in before suspend. In order to achieve that, do the following: 1. Don't modify cpusets during suspend/resume. At all. In particular, don't move the tasks from one cpuset to another, and don't modify any cpuset's cpus_allowed mask. So, simply ignore cpusets during the CPU hotplug operations that are carried out in the suspend/resume path. 2. However, cpusets and sched domains are related. We just want to avoid altering cpusets alone. So, to keep the sched domains updated, build a single sched domain (containing all active cpus) during each of the CPU hotplug operations carried out in s/r path, effectively ignoring the cpusets' cpus_allowed masks. (Since userspace is frozen while doing all this, it will go unnoticed.) 3. During the last CPU online operation during resume, build the sched domains by looking up the (unaltered) cpusets' cpus_allowed masks. That will bring back the system to the same original state as it was in before suspend. Ultimately, this will not only solve the cpuset problem related to suspend resume (ie., restores the cpusets to exactly what it was before suspend, by not touching it at all) but also speeds up suspend/resume because we avoid running cpuset update code for every CPU being offlined/onlined. Signed-off-by: Srivatsa S. Bhat Signed-off-by: Peter Zijlstra Cc: Linus Torvalds Cc: Andrew Morton Cc: Thomas Gleixner Link: http://lkml.kernel.org/r/20120524141611.3692.20155.stgit@srivatsabhat.in.ibm.com Signed-off-by: Ingo Molnar Signed-off-by: Preeti U Murthy Signed-off-by: Greg Kroah-Hartman commit 0c9673834740afdfa7c172b1b19c6b3ec5bc3907 Author: Seiji Aguchi Date: Tue Jul 24 13:27:23 2012 +0000 efi: initialize efi.runtime_version to make query_variable_info/update_capsule workable commit d6cf86d8f23253225fe2a763d627ecf7dfee9dae upstream. A value of efi.runtime_version is checked before calling update_capsule()/query_variable_info() as follows. But it isn't initialized anywhere. static efi_status_t virt_efi_query_variable_info(u32 attr, u64 *storage_space, u64 *remaining_space, u64 *max_variable_size) { if (efi.runtime_version < EFI_2_00_SYSTEM_TABLE_REVISION) return EFI_UNSUPPORTED; This patch initializes a value of efi.runtime_version at boot time. Signed-off-by: Seiji Aguchi Acked-by: Matthew Garrett Signed-off-by: Matt Fleming Signed-off-by: Ivan Hu Signed-off-by: Greg Kroah-Hartman commit b4bec66863ab1c55ccc1a9b01988b8b6b5abac87 Author: Matthew Garrett Date: Thu Jul 26 18:00:00 2012 -0400 efi: Build EFI stub with EFI-appropriate options commit 9dead5bbb825d7c25c0400e61de83075046322d0 upstream. We can't assume the presence of the red zone while we're still in a boot services environment, so we should build with -fno-red-zone to avoid problems. Change the size of wchar at the same time to make string handling simpler. Signed-off-by: Matthew Garrett Signed-off-by: Matt Fleming Acked-by: Josh Boyer Signed-off-by: Greg Kroah-Hartman commit d9302b128e4640df707c99335b3f3ad0e13d0148 Author: Mel Gorman Date: Mon Oct 8 16:29:20 2012 -0700 mempolicy: fix a memory corruption by refcount imbalance in alloc_pages_vma() commit 00442ad04a5eac08a98255697c510e708f6082e2 upstream. Commit cc9a6c877661 ("cpuset: mm: reduce large amounts of memory barrier related damage v3") introduced a potential memory corruption. shmem_alloc_page() uses a pseudo vma and it has one significant unique combination, vma->vm_ops=NULL and vma->policy->flags & MPOL_F_SHARED. get_vma_policy() does NOT increase a policy ref when vma->vm_ops=NULL and mpol_cond_put() DOES decrease a policy ref when a policy has MPOL_F_SHARED. Therefore, when a cpuset update race occurs, alloc_pages_vma() falls in 'goto retry_cpuset' path, decrements the reference count and frees the policy prematurely. Signed-off-by: KOSAKI Motohiro Signed-off-by: Mel Gorman Reviewed-by: Christoph Lameter Cc: Josh Boyer Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 04ea8a8388992ad12bc1b32cc16f4220300ea167 Author: KOSAKI Motohiro Date: Mon Oct 8 16:29:19 2012 -0700 mempolicy: fix refcount leak in mpol_set_shared_policy() commit 63f74ca21f1fad36d075e063f06dcc6d39fe86b2 upstream. When shared_policy_replace() fails to allocate new->policy is not freed correctly by mpol_set_shared_policy(). The problem is that shared mempolicy code directly call kmem_cache_free() in multiple places where it is easy to make a mistake. This patch creates an sp_free wrapper function and uses it. The bug was introduced pre-git age (IOW, before 2.6.12-rc2). [mgorman@suse.de: Editted changelog] Signed-off-by: KOSAKI Motohiro Signed-off-by: Mel Gorman Reviewed-by: Christoph Lameter Cc: Josh Boyer Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 04a30bd9dccbeee2cc6b035e42a37e9be0fa8c6c Author: Mel Gorman Date: Mon Oct 8 16:29:17 2012 -0700 mempolicy: fix a race in shared_policy_replace() commit b22d127a39ddd10d93deee3d96e643657ad53a49 upstream. shared_policy_replace() use of sp_alloc() is unsafe. 1) sp_node cannot be dereferenced if sp->lock is not held and 2) another thread can modify sp_node between spin_unlock for allocating a new sp node and next spin_lock. The bug was introduced before 2.6.12-rc2. Kosaki's original patch for this problem was to allocate an sp node and policy within shared_policy_replace and initialise it when the lock is reacquired. I was not keen on this approach because it partially duplicates sp_alloc(). As the paths were sp->lock is taken are not that performance critical this patch converts sp->lock to sp->mutex so it can sleep when calling sp_alloc(). [kosaki.motohiro@jp.fujitsu.com: Original patch] Signed-off-by: Mel Gorman Acked-by: KOSAKI Motohiro Reviewed-by: Christoph Lameter Cc: Josh Boyer Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 33efe2910bf7865a7d1fba4b06c9fa4b6fda6856 Author: KOSAKI Motohiro Date: Mon Oct 8 16:29:16 2012 -0700 mempolicy: remove mempolicy sharing commit 869833f2c5c6e4dd09a5378cfc665ffb4615e5d2 upstream. Dave Jones' system call fuzz testing tool "trinity" triggered the following bug error with slab debugging enabled ============================================================================= BUG numa_policy (Not tainted): Poison overwritten ----------------------------------------------------------------------------- INFO: 0xffff880146498250-0xffff880146498250. First byte 0x6a instead of 0x6b INFO: Allocated in mpol_new+0xa3/0x140 age=46310 cpu=6 pid=32154 __slab_alloc+0x3d3/0x445 kmem_cache_alloc+0x29d/0x2b0 mpol_new+0xa3/0x140 sys_mbind+0x142/0x620 system_call_fastpath+0x16/0x1b INFO: Freed in __mpol_put+0x27/0x30 age=46268 cpu=6 pid=32154 __slab_free+0x2e/0x1de kmem_cache_free+0x25a/0x260 __mpol_put+0x27/0x30 remove_vma+0x68/0x90 exit_mmap+0x118/0x140 mmput+0x73/0x110 exit_mm+0x108/0x130 do_exit+0x162/0xb90 do_group_exit+0x4f/0xc0 sys_exit_group+0x17/0x20 system_call_fastpath+0x16/0x1b INFO: Slab 0xffffea0005192600 objects=27 used=27 fp=0x (null) flags=0x20000000004080 INFO: Object 0xffff880146498250 @offset=592 fp=0xffff88014649b9d0 The problem is that the structure is being prematurely freed due to a reference count imbalance. In the following case mbind(addr, len) should replace the memory policies of both vma1 and vma2 and thus they will become to share the same mempolicy and the new mempolicy will have the MPOL_F_SHARED flag. +-------------------+-------------------+ | vma1 | vma2(shmem) | +-------------------+-------------------+ | | addr addr+len alloc_pages_vma() uses get_vma_policy() and mpol_cond_put() pair for maintaining the mempolicy reference count. The current rule is that get_vma_policy() only increments refcount for shmem VMA and mpol_conf_put() only decrements refcount if the policy has MPOL_F_SHARED. In above case, vma1 is not shmem vma and vma->policy has MPOL_F_SHARED! The reference count will be decreased even though was not increased whenever alloc_page_vma() is called. This has been broken since commit [52cd3b07: mempolicy: rework mempolicy Reference Counting] in 2008. There is another serious bug with the sharing of memory policies. Currently, mempolicy rebind logic (it is called from cpuset rebinding) ignores a refcount of mempolicy and override it forcibly. Thus, any mempolicy sharing may cause mempolicy corruption. The bug was introduced by commit [68860ec1: cpusets: automatic numa mempolicy rebinding]. Ideally, the shared policy handling would be rewritten to either properly handle COW of the policy structures or at least reference count MPOL_F_SHARED based exclusively on information within the policy. However, this patch takes the easier approach of disabling any policy sharing between VMAs. Each new range allocated with sp_alloc will allocate a new policy, set the reference count to 1 and drop the reference count of the old policy. This increases the memory footprint but is not expected to be a major problem as mbind() is unlikely to be used for fine-grained ranges. It is also inefficient because it means we allocate a new policy even in cases where mbind_range() could use the new_policy passed to it. However, it is more straight-forward and the change should be invisible to the user. [mgorman@suse.de: Edited changelog] Reported-by: Dave Jones Cc: Christoph Lameter Reviewed-by: Christoph Lameter Signed-off-by: KOSAKI Motohiro Signed-off-by: Mel Gorman Cc: Josh Boyer Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 14cfd9f986d521f3f97139fa2ae5db67886059d1 Author: KOSAKI Motohiro Date: Mon Oct 8 16:29:14 2012 -0700 revert "mm: mempolicy: Let vma_merge and vma_split handle vma->vm_policy linkages" commit 8d34694c1abf29df1f3c7317936b7e3e2e308d9b upstream. Commit 05f144a0d5c2 ("mm: mempolicy: Let vma_merge and vma_split handle vma->vm_policy linkages") removed vma->vm_policy updates code but it is the purpose of mbind_range(). Now, mbind_range() is virtually a no-op and while it does not allow memory corruption it is not the right fix. This patch is a revert. [mgorman@suse.de: Edited changelog] Signed-off-by: KOSAKI Motohiro Signed-off-by: Mel Gorman Cc: Christoph Lameter Cc: Josh Boyer Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 160f00fefef1e5b72be0745236d2b65d14b99ed4 Author: Francois Romieu Date: Sat Oct 6 11:19:53 2012 +0200 r8169: 8168c and later require bit 0x20 to be set in Config2 for PME signaling. commit d387b427c973974dd619a33549c070ac5d0e089f upstream. The new 84xx stopped flying below the radars. Signed-off-by: Francois Romieu Cc: Hayes Wang Acked-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 34ffca41fc0a205bb00382be7bbefbf2fc63436d Author: Francois Romieu Date: Sat Oct 6 11:19:52 2012 +0200 r8169: Config1 is read-only on 8168c and later. commit 851e60221926a53344b4227879858bef841b0477 upstream. Suggested by Hayes. Signed-off-by: Francois Romieu Cc: Hayes Wang Acked-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit f5260a7c89bf5146c0d52b036ac58eb192537a84 Author: Paul E. McKenney Date: Sat Sep 22 13:55:30 2012 -0700 rcu: Fix day-one dyntick-idle stall-warning bug commit a10d206ef1a83121ab7430cb196e0376a7145b22 upstream. Each grace period is supposed to have at least one callback waiting for that grace period to complete. However, if CONFIG_NO_HZ=n, an extra callback-free grace period is no big problem -- it will chew up a tiny bit of CPU time, but it will complete normally. In contrast, CONFIG_NO_HZ=y kernels have the potential for all the CPUs to go to sleep indefinitely, in turn indefinitely delaying completion of the callback-free grace period. Given that nothing is waiting on this grace period, this is also not a problem. That is, unless RCU CPU stall warnings are also enabled, as they are in recent kernels. In this case, if a CPU wakes up after at least one minute of inactivity, an RCU CPU stall warning will result. The reason that no one noticed until quite recently is that most systems have enough OS noise that they will never remain absolutely idle for a full minute. But there are some embedded systems with cut-down userspace configurations that consistently get into this situation. All this begs the question of exactly how a callback-free grace period gets started in the first place. This can happen due to the fact that CPUs do not necessarily agree on which grace period is in progress. If a CPU still believes that the grace period that just completed is still ongoing, it will believe that it has callbacks that need to wait for another grace period, never mind the fact that the grace period that they were waiting for just completed. This CPU can therefore erroneously decide to start a new grace period. Note that this can happen in TREE_RCU and TREE_PREEMPT_RCU even on a single-CPU system: Deadlock considerations mean that the CPU that detected the end of the grace period is not necessarily officially informed of this fact for some time. Once this CPU notices that the earlier grace period completed, it will invoke its callbacks. It then won't have any callbacks left. If no other CPU has any callbacks, we now have a callback-free grace period. This commit therefore makes CPUs check more carefully before starting a new grace period. This new check relies on an array of tail pointers into each CPU's list of callbacks. If the CPU is up to date on which grace periods have completed, it checks to see if any callbacks follow the RCU_DONE_TAIL segment, otherwise it checks to see if any callbacks follow the RCU_WAIT_TAIL segment. The reason that this works is that the RCU_WAIT_TAIL segment will be promoted to the RCU_DONE_TAIL segment as soon as the CPU is officially notified that the old grace period has ended. This change is to cpu_needs_another_gp(), which is called in a number of places. The only one that really matters is in rcu_start_gp(), where the root rcu_node structure's ->lock is held, which prevents any other CPU from starting or completing a grace period, so that the comparison that determines whether the CPU is missing the completion of a grace period is stable. Reported-by: Becky Bruce Reported-by: Subodh Nijsure Reported-by: Paul Walmsley Signed-off-by: Paul E. McKenney Signed-off-by: Paul E. McKenney Tested-by: Paul Walmsley Signed-off-by: Greg Kroah-Hartman commit 4a94001bd83f18ff3f3b998899d250d96f60a503 Author: Frederic Weisbecker Date: Wed Aug 22 17:27:34 2012 +0200 score: Add missing RCU idle APIs on idle loop commit 0ee23fda59740767324b4340247ca41a2f498ca6 upstream. In the old times, the whole idle task was considered as an RCU quiescent state. But as RCU became more and more successful overtime, some RCU read side critical section have been added even in the code of some architectures idle tasks, for tracing for example. So nowadays, rcu_idle_enter() and rcu_idle_exit() must be called by the architecture to tell RCU about the part in the idle loop that doesn't make use of rcu read side critical sections, typically the part that puts the CPU in low power mode. This is necessary for RCU to find the quiescent states in idle in order to complete grace periods. Add this missing pair of calls in scores's idle loop. Reported-by: Paul E. McKenney Signed-off-by: Frederic Weisbecker Cc: Chen Liqin Cc: Lennox Wu Signed-off-by: Paul E. McKenney Reviewed-by: Josh Triplett Signed-off-by: Greg Kroah-Hartman commit 1b5124a305593deffd6186540546b8bfba429570 Author: Frederic Weisbecker Date: Wed Aug 22 17:27:34 2012 +0200 m32r: Add missing RCU idle APIs on idle loop commit 48ae077cfce72591b8fc80e1dcc87806f86fed7f upstream. In the old times, the whole idle task was considered as an RCU quiescent state. But as RCU became more and more successful overtime, some RCU read side critical section have been added even in the code of some architectures idle tasks, for tracing for example. So nowadays, rcu_idle_enter() and rcu_idle_exit() must be called by the architecture to tell RCU about the part in the idle loop that doesn't make use of rcu read side critical sections, typically the part that puts the CPU in low power mode. This is necessary for RCU to find the quiescent states in idle in order to complete grace periods. Add this missing pair of calls in the m32r's idle loop. Reported-by: Paul E. McKenney Signed-off-by: Frederic Weisbecker Cc: Hirokazu Takata Signed-off-by: Paul E. McKenney Reviewed-by: Josh Triplett Signed-off-by: Greg Kroah-Hartman commit a3e9082a19d723e7be0bc51d851b5689eee67c9c Author: Frederic Weisbecker Date: Wed Aug 22 17:27:34 2012 +0200 cris: Add missing RCU idle APIs on idle loop commit c633f9e788928e91ad11f44df29b47bbbe9550b0 upstream. In the old times, the whole idle task was considered as an RCU quiescent state. But as RCU became more and more successful overtime, some RCU read side critical section have been added even in the code of some architectures idle tasks, for tracing for example. So nowadays, rcu_idle_enter() and rcu_idle_exit() must be called by the architecture to tell RCU about the part in the idle loop that doesn't make use of rcu read side critical sections, typically the part that puts the CPU in low power mode. This is necessary for RCU to find the quiescent states in idle in order to complete grace periods. Add this missing pair of calls in the Cris's idle loop. Reported-by: Paul E. McKenney Signed-off-by: Frederic Weisbecker Cc: Mikael Starvik Cc: Jesper Nilsson Cc: Cris Signed-off-by: Paul E. McKenney Reviewed-by: Josh Triplett Signed-off-by: Greg Kroah-Hartman commit 3b30d3182fbf026578fbad09877d78c338562f4c Author: Frederic Weisbecker Date: Wed Aug 22 17:27:34 2012 +0200 alpha: Add missing RCU idle APIs on idle loop commit 4c94cada48f7c660eca582be6032427a5e367117 upstream. In the old times, the whole idle task was considered as an RCU quiescent state. But as RCU became more and more successful overtime, some RCU read side critical section have been added even in the code of some architectures idle tasks, for tracing for example. So nowadays, rcu_idle_enter() and rcu_idle_exit() must be called by the architecture to tell RCU about the part in the idle loop that doesn't make use of rcu read side critical sections, typically the part that puts the CPU in low power mode. This is necessary for RCU to find the quiescent states in idle in order to complete grace periods. Add this missing pair of calls in the Alpha's idle loop. Reported-by: Paul E. McKenney Signed-off-by: Frederic Weisbecker Tested-by: Michael Cree Cc: Richard Henderson Cc: Ivan Kokshaysky Cc: Matt Turner Cc: alpha Cc: Paul E. McKenney Reviewed-by: Josh Triplett Signed-off-by: Greg Kroah-Hartman commit ce6ccabd21d356ac22973d275f122d3d8efc96d7 Author: Frederic Weisbecker Date: Wed Aug 22 17:27:34 2012 +0200 m68k: Add missing RCU idle APIs on idle loop commit 5b57ba37e82a15f345a6a2eb8c01a2b2d94c5eeb upstream. In the old times, the whole idle task was considered as an RCU quiescent state. But as RCU became more and more successful overtime, some RCU read side critical section have been added even in the code of some architectures idle tasks, for tracing for example. So nowadays, rcu_idle_enter() and rcu_idle_exit() must be called by the architecture to tell RCU about the part in the idle loop that doesn't make use of rcu read side critical sections, typically the part that puts the CPU in low power mode. This is necessary for RCU to find the quiescent states in idle in order to complete grace periods. Add this missing pair of calls in the m68k's idle loop. Reported-by: Paul E. McKenney Signed-off-by: Frederic Weisbecker Acked-by: Geert Uytterhoeven Cc: m68k Signed-off-by: Paul E. McKenney Reviewed-by: Josh Triplett Signed-off-by: Greg Kroah-Hartman commit c7cc1fa9594f1c7fe4de4418dbe9d85c3a4727ee Author: Frederic Weisbecker Date: Wed Aug 22 17:27:34 2012 +0200 mn10300: Add missing RCU idle APIs on idle loop commit 5b0753a90b7a98bc613c3767e9263a1a76d4f900 upstream. In the old times, the whole idle task was considered as an RCU quiescent state. But as RCU became more and more successful overtime, some RCU read side critical section have been added even in the code of some architectures idle tasks, for tracing for example. So nowadays, rcu_idle_enter() and rcu_idle_exit() must be called by the architecture to tell RCU about the part in the idle loop that doesn't make use of rcu read side critical sections, typically the part that puts the CPU in low power mode. This is necessary for RCU to find the quiescent states in idle in order to complete grace periods. Add this missing pair of calls in the mn10300's idle loop. Reported-by: Paul E. McKenney Signed-off-by: Frederic Weisbecker Cc: David Howells Cc: Koichi Yasutake Signed-off-by: Paul E. McKenney Acked-by: David Howells Reviewed-by: Josh Triplett Signed-off-by: Greg Kroah-Hartman commit c272355732aaf718ba4c05b0693c0d327ea0de45 Author: Frederic Weisbecker Date: Wed Aug 22 17:27:34 2012 +0200 frv: Add missing RCU idle APIs on idle loop commit 41d8fe5bb3cf91ce2716ac8f43e4b40d802a99c8 upstream. In the old times, the whole idle task was considered as an RCU quiescent state. But as RCU became more and more successful overtime, some RCU read side critical section have been added even in the code of some architectures idle tasks, for tracing for example. So nowadays, rcu_idle_enter() and rcu_idle_exit() must be called by the architecture to tell RCU about the part in the idle loop that doesn't make use of rcu read side critical sections, typically the part that puts the CPU in low power mode. This is necessary for RCU to find the quiescent states in idle in order to complete grace periods. Add this missing pair of calls in the Frv's idle loop. Reported-by: Paul E. McKenney Signed-off-by: Frederic Weisbecker Cc: David Howells Acked-by: David Howells Reviewed-by: Josh Triplett Signed-off-by: Greg Kroah-Hartman commit 2ddcad1f012eb832b5ad61f27888bf36eb495e83 Author: Frederic Weisbecker Date: Wed Aug 22 17:27:34 2012 +0200 xtensa: Add missing RCU idle APIs on idle loop commit 11ad47a0edbd62bfc0547cfcdf227a911433f207 upstream. In the old times, the whole idle task was considered as an RCU quiescent state. But as RCU became more and more successful overtime, some RCU read side critical section have been added even in the code of some architectures idle tasks, for tracing for example. So nowadays, rcu_idle_enter() and rcu_idle_exit() must be called by the architecture to tell RCU about the part in the idle loop that doesn't make use of rcu read side critical sections, typically the part that puts the CPU in low power mode. This is necessary for RCU to find the quiescent states in idle in order to complete grace periods. Add this missing pair of calls in the xtensa's idle loop. Reported-by: Paul E. McKenney Signed-off-by: Frederic Weisbecker Cc: Chris Zankel Signed-off-by: Paul E. McKenney Reviewed-by: Josh Triplett Signed-off-by: Greg Kroah-Hartman commit d2d41b722e5faf418c1a660eaa602d06e50402eb Author: Frederic Weisbecker Date: Wed Aug 22 17:27:34 2012 +0200 parisc: Add missing RCU idle APIs on idle loop commit fbe752188d5589e7fcbb8e79824e560f77dccc92 upstream. In the old times, the whole idle task was considered as an RCU quiescent state. But as RCU became more and more successful overtime, some RCU read side critical section have been added even in the code of some architectures idle tasks, for tracing for example. So nowadays, rcu_idle_enter() and rcu_idle_exit() must be called by the architecture to tell RCU about the part in the idle loop that doesn't make use of rcu read side critical sections, typically the part that puts the CPU in low power mode. This is necessary for RCU to find the quiescent states in idle in order to complete grace periods. Add this missing pair of calls in the parisc's idle loop. Reported-by: Paul E. McKenney Signed-off-by: Frederic Weisbecker Cc: James E.J. Bottomley Cc: Helge Deller Cc: Parisc Signed-off-by: Paul E. McKenney Reviewed-by: Josh Triplett Signed-off-by: Greg Kroah-Hartman commit 699d26ccee4e22e324468e94eb648cc89964a15d Author: Frederic Weisbecker Date: Wed Aug 22 17:27:34 2012 +0200 h8300: Add missing RCU idle APIs on idle loop commit b2fe1430d4115c74d007c825cb9dc3317f28bb16 upstream. In the old times, the whole idle task was considered as an RCU quiescent state. But as RCU became more and more successful overtime, some RCU read side critical section have been added even in the code of some architectures idle tasks, for tracing for example. So nowadays, rcu_idle_enter() and rcu_idle_exit() must be called by the architecture to tell RCU about the part in the idle loop that doesn't make use of rcu read side critical sections, typically the part that puts the CPU in low power mode. This is necessary for RCU to find the quiescent states in idle in order to complete grace periods. Add this missing pair of calls in the h8300's idle loop. Reported-by: Paul E. McKenney Signed-off-by: Frederic Weisbecker Cc: Yoshinori Sato Signed-off-by: Paul E. McKenney Reviewed-by: Josh Triplett Signed-off-by: Greg Kroah-Hartman commit c5a3f2e9981b6da91e70765d0a958a0ac9d14a53 Author: Paul E. McKenney Date: Fri Aug 24 13:22:13 2012 -0700 ia64: Add missing RCU idle APIs on idle loop commit 93482f4ef1093f5961a63359a34612183d6beea0 upstream. Traditionally, the entire idle task served as an RCU quiescent state. But when RCU read side critical sections started appearing within the idle loop, this traditional strategy became untenable. The fix was to create new RCU APIs named rcu_idle_enter() and rcu_idle_exit(), which must be called by each architecture's idle loop so that RCU can tell when it is safe to ignore a given idle CPU. Unfortunately, this fix was never applied to ia64, a shortcoming remedied by this commit. Reported by: Tony Luck Signed-off-by: Paul E. McKenney Signed-off-by: Paul E. McKenney Tested by: Tony Luck Reviewed-by: Josh Triplett Signed-off-by: Greg Kroah-Hartman commit d59316ee086133955a993855655b24925c0e6b05 Author: Alex Deucher Date: Wed Sep 26 12:40:45 2012 -0400 drm/radeon: force MSIs on RS690 asics commit fb6ca6d154cdcd53e7f27f8dbba513830372699b upstream. There are so many quirks, lets just try and force this for all RS690s. See: https://bugs.freedesktop.org/show_bug.cgi?id=37679 Signed-off-by: Alex Deucher Signed-off-by: Greg Kroah-Hartman commit 60a601fda771cf25df2b702dbba1a3e22e81fa2c Author: Alex Deucher Date: Wed Sep 26 12:31:45 2012 -0400 drm/radeon: Add MSI quirk for gateway RS690 commit 3a6d59df80897cc87812b6826d70085905bed013 upstream. Fixes another system on: https://bugs.freedesktop.org/show_bug.cgi?id=37679 Signed-off-by: Alex Deucher Signed-off-by: Greg Kroah-Hartman commit 47c3ae0b3b5b360387602c49d7076e6114d05879 Author: Alex Deucher Date: Fri Sep 14 10:59:26 2012 -0400 drm/radeon: only adjust default clocks on NI GPUs commit 2e3b3b105ab3bb5b6a37198da4f193cd13781d13 upstream. SI asics store voltage information differently so we don't have a way to deal with it properly yet. Signed-off-by: Alex Deucher Signed-off-by: Greg Kroah-Hartman commit 507130a9e351f5662785a8a09ccdc5687b01aabe Author: Chris Wilson Date: Mon Sep 17 09:38:03 2012 +0000 drm: Destroy the planes prior to destroying the associated CRTC commit 3184009c36da413724f283e3c7ac9cc60c623bc4 upstream. As during the plane cleanup, we wish to disable the hardware and so may modify state on the associated CRTC, that CRTC must continue to exist until we are finished. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=54101 Signed-off-by: Chris Wilson Cc: Jesse Barnes Reviewed-by: Jesse Barnes Tested-by: lu hua Signed-off-by: Dave Airlie Signed-off-by: Greg Kroah-Hartman commit 3e0f62d7750436e35540df761dcc2d8b01a9a5bf Author: Marko Friedemann Date: Mon Sep 3 10:12:40 2012 +0200 ALSA: USB: Support for (original) Xbox Communicator commit c05fce586d4da2dfe0309bef3795a8586e967bc3 upstream. Added support for Xbox Communicator to USB quirks. Signed-off-by: Marko Friedemann Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit 92c635bfcd86aa182b9a6734426e61e1420579d7 Author: David Henningsson Date: Thu Sep 20 10:20:41 2012 +0200 ALSA: usb - disable broken hw volume for Tenx TP6911 commit c10514394ef9e8de93a4ad8c8904d71dcd82c122 upstream. While going through Ubuntu bugs, I discovered this patch being posted and a confirmation that the patch works as expected. Finding out how the hw volume really works would be preferrable to just disabling the broken one, but this would be better than nothing. Credit: sndfnsdfin (qawsnews) BugLink: https://bugs.launchpad.net/bugs/559939 Signed-off-by: David Henningsson Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit c8ab5d973cd5439ae0223ca259f0e20963905bef Author: Herton Ronaldo Krzesinski Date: Thu Sep 27 10:38:14 2012 -0300 ALSA: hda/realtek - Fix detection of ALC271X codec commit 9f720bb9409ea5923361fbd3fdbc505ca36cf012 upstream. In commit af741c1 ("ALSA: hda/realtek - Call alc_auto_parse_customize_define() always after fixup"), alc_auto_parse_customize_define was moved after detection of ALC271X. The problem is that detection of ALC271X relies on spec->cdefine.platform_type, and it's set on alc_auto_parse_customize_define. Move the alc_auto_parse_customize_define and its required fixup setup before the block doing the ALC271X and other codec setup. BugLink: https://bugs.launchpad.net/bugs/1006690 Signed-off-by: Herton Ronaldo Krzesinski Reviewed-by: David Henningsson Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit 4ca1a84f737e7f8166aed08ba87f5acc49120da9 Author: Omair Mohammed Abdullah Date: Sat Sep 29 12:24:05 2012 +0530 ALSA: aloop - add locking to timer access commit d4f1e48bd11e3df6a26811f7a1f06c4225d92f7d upstream. When the loopback timer handler is running, calling del_timer() (for STOP trigger) will not wait for the handler to complete before deactivating the timer. The timer gets rescheduled in the handler as usual. Then a subsequent START trigger will try to start the timer using add_timer() with a timer pending leading to a kernel panic. Serialize the calls to add_timer() and del_timer() using a spin lock to avoid this. Signed-off-by: Omair Mohammed Abdullah Signed-off-by: Vinod Koul Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit c3b9446604af20c426b37b8191d9d089b40f5899 Author: Andrea Arcangeli Date: Mon Oct 8 16:33:27 2012 -0700 mm: thp: fix pmd_present for split_huge_page and PROT_NONE with THP commit 027ef6c87853b0a9df53175063028edb4950d476 upstream. In many places !pmd_present has been converted to pmd_none. For pmds that's equivalent and pmd_none is quicker so using pmd_none is better. However (unless we delete pmd_present) we should provide an accurate pmd_present too. This will avoid the risk of code thinking the pmd is non present because it's under __split_huge_page_map, see the pmd_mknotpresent there and the comment above it. If the page has been mprotected as PROT_NONE, it would also lead to a pmd_present false negative in the same way as the race with split_huge_page. Because the PSE bit stays on at all times (both during split_huge_page and when the _PAGE_PROTNONE bit get set), we could only check for the PSE bit, but checking the PROTNONE bit too is still good to remember pmd_present must always keep PROT_NONE into account. This explains a not reproducible BUG_ON that was seldom reported on the lists. The same issue is in pmd_large, it would go wrong with both PROT_NONE and if it races with split_huge_page. Signed-off-by: Andrea Arcangeli Acked-by: Rik van Riel Cc: Johannes Weiner Cc: Hugh Dickins Cc: Mel Gorman Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 6a91e161a7f1b0125822b1693ea8fe46035b0f1b Author: Hugh Dickins Date: Mon Oct 8 16:33:14 2012 -0700 mm: fix invalidate_complete_page2() lock ordering commit ec4d9f626d5908b6052c2973f37992f1db52e967 upstream. In fuzzing with trinity, lockdep protested "possible irq lock inversion dependency detected" when isolate_lru_page() reenabled interrupts while still holding the supposedly irq-safe tree_lock: invalidate_inode_pages2 invalidate_complete_page2 spin_lock_irq(&mapping->tree_lock) clear_page_mlock isolate_lru_page spin_unlock_irq(&zone->lru_lock) isolate_lru_page() is correct to enable interrupts unconditionally: invalidate_complete_page2() is incorrect to call clear_page_mlock() while holding tree_lock, which is supposed to nest inside lru_lock. Both truncate_complete_page() and invalidate_complete_page() call clear_page_mlock() before taking tree_lock to remove page from radix_tree. I guess invalidate_complete_page2() preferred to test PageDirty (again) under tree_lock before committing to the munlock; but since the page has already been unmapped, its state is already somewhat inconsistent, and no worse if clear_page_mlock() moved up. Reported-by: Sasha Levin Deciphered-by: Andrew Morton Signed-off-by: Hugh Dickins Acked-by: Mel Gorman Cc: Rik van Riel Cc: Johannes Weiner Cc: Michel Lespinasse Cc: Ying Han Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 26fc07d8da4aa118cfe44776c2d2ddef8ef5f641 Author: Michal Hocko Date: Mon Oct 8 16:33:31 2012 -0700 hugetlb: do not use vma_hugecache_offset() for vma_prio_tree_foreach commit 36e4f20af833d1ce196e6a4ade05dc26c44652d1 upstream. Commit 0c176d52b0b2 ("mm: hugetlb: fix pgoff computation when unmapping page from vma") fixed pgoff calculation but it has replaced it by vma_hugecache_offset() which is not approapriate for offsets used for vma_prio_tree_foreach() because that one expects index in page units rather than in huge_page_shift. Johannes said: : The resulting index may not be too big, but it can be too small: assume : hpage size of 2M and the address to unmap to be 0x200000. This is regular : page index 512 and hpage index 1. If you have a VMA that maps the file : only starting at the second huge page, that VMAs vm_pgoff will be 512 but : you ask for offset 1 and miss it even though it does map the page of : interest. hugetlb_cow() will try to unmap, miss the vma, and retry the : cow until the allocation succeeds or the skipped vma(s) go away. Signed-off-by: Michal Hocko Acked-by: Hillf Danton Cc: Mel Gorman Cc: KAMEZAWA Hiroyuki Cc: Andrea Arcangeli Cc: David Rientjes Acked-by: Johannes Weiner Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 70fb727d0900836b71ebf7a7badaa7b7c9068102 Author: Naoya Horiguchi Date: Mon Oct 8 16:33:47 2012 -0700 kpageflags: fix wrong KPF_THP on non-huge compound pages commit 7a71932d5676b7410ab64d149bad8bde6b0d8632 upstream. KPF_THP can be set on non-huge compound pages (like slab pages or pages allocated by drivers with __GFP_COMP) because PageTransCompound only checks PG_head and PG_tail. Obviously this is a bug and breaks user space applications which look for thp via /proc/kpageflags. This patch rules out setting KPF_THP wrongly by additionally checking PageLRU on the head pages. Signed-off-by: Naoya Horiguchi Acked-by: KOSAKI Motohiro Acked-by: David Rientjes Reviewed-by: Fengguang Wu Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 62bd817182be053f1c6dde95748d34fd7be27bb0 Author: Mark Brown Date: Tue Jul 31 18:37:29 2012 +0100 ASoC: wm9712: Fix name of Capture Switch commit 689185b78ba6fbe0042f662a468b5565909dff7a upstream. Help UIs associate it with the matching gain control. Signed-off-by: Mark Brown Signed-off-by: Greg Kroah-Hartman commit e87ba35f1fba828bb198bdf89704eb9583932da8 Author: Jan Kara Date: Wed Sep 26 21:52:20 2012 -0400 ext4: fix fdatasync() for files with only i_size changes commit b71fc079b5d8f42b2a52743c8d2f1d35d655b1c5 upstream. Code tracking when transaction needs to be committed on fdatasync(2) forgets to handle a situation when only inode's i_size is changed. Thus in such situations fdatasync(2) doesn't force transaction with new i_size to disk and that can result in wrong i_size after a crash. Fix the issue by updating inode's i_datasync_tid whenever its size is updated. Reported-by: Kristian Nielsen Signed-off-by: Jan Kara Signed-off-by: Greg Kroah-Hartman commit 7061315f57c1dfa81d3d9b2f0efb50ce2b6f3371 Author: Bernd Schubert Date: Wed Sep 26 21:24:57 2012 -0400 ext4: always set i_op in ext4_mknod() commit 6a08f447facb4f9e29fcc30fb68060bb5a0d21c2 upstream. ext4_special_inode_operations have their own ifdef CONFIG_EXT4_FS_XATTR to mask those methods. And ext4_iget also always sets it, so there is an inconsistency. Signed-off-by: Bernd Schubert Signed-off-by: "Theodore Ts'o" Signed-off-by: Greg Kroah-Hartman commit a4bf81c2bf891cbd46d6a69ba15dd7d4350b62f3 Author: Dmitry Monakhov Date: Wed Sep 26 12:32:54 2012 -0400 ext4: online defrag is not supported for journaled files commit f066055a3449f0e5b0ae4f3ceab4445bead47638 upstream. Proper block swap for inodes with full journaling enabled is truly non obvious task. In order to be on a safe side let's explicitly disable it for now. Signed-off-by: Dmitry Monakhov Signed-off-by: "Theodore Ts'o" Signed-off-by: Greg Kroah-Hartman commit 67bec353575f1205425c07bd76dbe5dfeb950a1e Author: Dmitry Monakhov Date: Wed Sep 26 12:32:19 2012 -0400 ext4: move_extent code cleanup commit 03bd8b9b896c8e940f282f540e6b4de90d666b7c upstream. - Remove usless checks, because it is too late to check that inode != NULL at the moment it was referenced several times. - Double lock routines looks very ugly and locking ordering relays on order of i_ino, but other kernel code rely on order of pointers. Let's make them simple and clean. - check that inodes belongs to the same SB as soon as possible. Signed-off-by: Dmitry Monakhov Signed-off-by: "Theodore Ts'o" Signed-off-by: Greg Kroah-Hartman commit d49765a2110de8ac18aa74373ae89c1e4484b5d6 Author: Herton Ronaldo Krzesinski Date: Sun Sep 23 22:49:12 2012 -0400 ext4: fix crash when accessing /proc/mounts concurrently commit 50df9fd55e4271e89a7adf3b1172083dd0ca199d upstream. The crash was caused by a variable being erronously declared static in token2str(). In addition to /proc/mounts, the problem can also be easily replicated by accessing /proc/fs/ext4//options in parallel: $ cat /proc/fs/ext4//options > options.txt ... and then running the following command in two different terminals: $ while diff /proc/fs/ext4//options options.txt; do true; done This is also the cause of the following a crash while running xfstests #234, as reported in the following bug reports: https://bugs.launchpad.net/bugs/1053019 https://bugzilla.kernel.org/show_bug.cgi?id=47731 Signed-off-by: Herton Ronaldo Krzesinski Signed-off-by: "Theodore Ts'o" Cc: Brad Figg Signed-off-by: Greg Kroah-Hartman commit 78790d120fd2a6ffa39dc23e57fdf9133e67b14c Author: Theodore Ts'o Date: Wed Sep 19 22:42:36 2012 -0400 ext4: fix potential deadlock in ext4_nonda_switch() commit 00d4e7362ed01987183e9528295de3213031309c upstream. In ext4_nonda_switch(), if the file system is getting full we used to call writeback_inodes_sb_if_idle(). The problem is that we can be holding i_mutex already, and this causes a potential deadlock when writeback_inodes_sb_if_idle() when it tries to take s_umount. (See lockdep output below). As it turns out we don't need need to hold s_umount; the fact that we are in the middle of the write(2) system call will keep the superblock pinned. Unfortunately writeback_inodes_sb() checks to make sure s_umount is taken, and the VFS uses a different mechanism for making sure the file system doesn't get unmounted out from under us. The simplest way of dealing with this is to just simply grab s_umount using a trylock, and skip kicking the writeback flusher thread in the very unlikely case that we can't take a read lock on s_umount without blocking. Also, we now check the cirteria for kicking the writeback thread before we decide to whether to fall back to non-delayed writeback, so if there are any outstanding delayed allocation writes, we try to get them resolved as soon as possible. [ INFO: possible circular locking dependency detected ] 3.6.0-rc1-00042-gce894ca #367 Not tainted ------------------------------------------------------- dd/8298 is trying to acquire lock: (&type->s_umount_key#18){++++..}, at: [] writeback_inodes_sb_if_idle+0x28/0x46 but task is already holding lock: (&sb->s_type->i_mutex_key#8){+.+...}, at: [] generic_file_aio_write+0x5f/0xd3 which lock already depends on the new lock. 2 locks held by dd/8298: #0: (sb_writers#2){.+.+.+}, at: [] generic_file_aio_write+0x56/0xd3 #1: (&sb->s_type->i_mutex_key#8){+.+...}, at: [] generic_file_aio_write+0x5f/0xd3 stack backtrace: Pid: 8298, comm: dd Not tainted 3.6.0-rc1-00042-gce894ca #367 Call Trace: [] ? console_unlock+0x345/0x372 [] print_circular_bug+0x190/0x19d [] __lock_acquire+0x86d/0xb6c [] ? mark_held_locks+0x5c/0x7b [] lock_acquire+0x66/0xb9 [] ? writeback_inodes_sb_if_idle+0x28/0x46 [] down_read+0x28/0x58 [] ? writeback_inodes_sb_if_idle+0x28/0x46 [] writeback_inodes_sb_if_idle+0x28/0x46 [] ext4_nonda_switch+0xe1/0xf4 [] ext4_da_write_begin+0x27/0x193 [] generic_file_buffered_write+0xc8/0x1bb [] __generic_file_aio_write+0x1dd/0x205 [] generic_file_aio_write+0x78/0xd3 [] ext4_file_write+0x480/0x4a6 [] ? __lock_acquire+0x41e/0xb6c [] ? sched_clock_cpu+0x11a/0x13e [] ? trace_hardirqs_off+0xb/0xd [] ? local_clock+0x37/0x4e [] do_sync_write+0x67/0x9d [] ? wait_on_retry_sync_kiocb+0x44/0x44 [] vfs_write+0x7b/0xe6 [] sys_write+0x3b/0x64 [] syscall_call+0x7/0xb Signed-off-by: "Theodore Ts'o" Signed-off-by: Greg Kroah-Hartman commit 2de1ece4de710668bb53eff74bffe80bb267087f Author: Yongqiang Yang Date: Wed Sep 5 01:27:50 2012 -0400 ext4: avoid duplicate writes of the backup bg descriptor blocks commit 2ebd1704ded88a8ae29b5f3998b13959c715c4be upstream. The resize code was needlessly writing the backup block group descriptor blocks multiple times (once per block group) during an online resize. Signed-off-by: Yongqiang Yang Signed-off-by: "Theodore Ts'o" Signed-off-by: Greg Kroah-Hartman commit 1079d76f47379886ebfd3ae0f79a9f6cd62df29e Author: Yongqiang Yang Date: Wed Sep 5 01:25:50 2012 -0400 ext4: don't copy non-existent gdt blocks when resizing commit 6df935ad2fced9033ab52078825fcaf6365f34b7 upstream. The resize code was copying blocks at the beginning of each block group in order to copy the superblock and block group descriptor table (gdt) blocks. This was, unfortunately, being done even for block groups that did not have super blocks or gdt blocks. This is a complete waste of perfectly good I/O bandwidth, to skip writing those blocks for sparse bg's. Signed-off-by: Yongqiang Yang Signed-off-by: "Theodore Ts'o" Signed-off-by: Greg Kroah-Hartman commit 2d5a1fbc0c25ab4a2e2770a5c729467893407009 Author: Yongqiang Yang Date: Wed Sep 5 01:21:50 2012 -0400 ext4: ignore last group w/o enough space when resizing instead of BUG'ing commit 03c1c29053f678234dbd51bf3d65f3b7529021de upstream. If the last group does not have enough space for group tables, ignore it instead of calling BUG_ON(). Reported-by: Daniel Drake Signed-off-by: Yongqiang Yang Signed-off-by: "Theodore Ts'o" Signed-off-by: Greg Kroah-Hartman commit b790ef23cd1fd3fece4777e90d85a945c4820ae7 Author: Yinghai Lu Date: Mon Sep 10 17:19:33 2012 -0700 PCI: Check P2P bridge for invalid secondary/subordinate range commit 1965f66e7db08d1ebccd24a59043eba826cc1ce8 upstream. For bridges with "secondary > subordinate", i.e., invalid bus number apertures, we don't enumerate anything behind the bridge unless the user specified "pci=assign-busses". This patch makes us automatically try to reassign the downstream bus numbers in this case (just for that bridge, not for all bridges as "pci=assign-busses" does). We don't discover all the devices on the Intel DP43BF motherboard without this change (or "pci=assign-busses") because its BIOS configures a bridge as: pci 0000:00:1e.0: PCI bridge to [bus 20-08] (subtractive decode) [bhelgaas: changelog, change message to dev_info] Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=18412 Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=625754 Reported-by: Brian C. Huffman Reported-by: VL Tested-by: VL Signed-off-by: Yinghai Lu Signed-off-by: Bjorn Helgaas commit 82e0551186fd455d792fccae805470f222339c06 Author: Martin Peschke Date: Tue Sep 4 15:23:36 2012 +0200 SCSI: zfcp: only access zfcp_scsi_dev for valid scsi_device commit d436de8ce25f53a8a880a931886821f632247943 upstream. __scsi_remove_device (e.g. due to dev_loss_tmo) calls zfcp_scsi_slave_destroy which in turn sends a close LUN FSF request to the adapter. After 30 seconds without response, zfcp_erp_timeout_handler kicks the ERP thread failing the close LUN ERP action. zfcp_erp_wait in zfcp_erp_lun_shutdown_wait and thus zfcp_scsi_slave_destroy returns and then scsi_device is no longer valid. Sometime later the response to the close LUN FSF request may finally come in. However, commit b62a8d9b45b971a67a0f8413338c230e3117dff5 "[SCSI] zfcp: Use SCSI device data zfcp_scsi_dev instead of zfcp_unit" introduced a number of attempts to unconditionally access struct zfcp_scsi_dev through struct scsi_device causing a use-after-free. This leads to an Oops due to kernel page fault in one of: zfcp_fsf_abort_fcp_command_handler, zfcp_fsf_open_lun_handler, zfcp_fsf_close_lun_handler, zfcp_fsf_req_trace, zfcp_fsf_fcp_handler_common. Move dereferencing of zfcp private data zfcp_scsi_dev allocated in scsi_device via scsi_transport_reserve_device after the check for potentially aborted FSF request and thus no longer valid scsi_device. Only then assign sdev_to_zfcp(sdev) to the local auto variable struct zfcp_scsi_dev *zfcp_sdev. Signed-off-by: Martin Peschke Signed-off-by: Steffen Maier Signed-off-by: James Bottomley Signed-off-by: Greg Kroah-Hartman commit 960e8a5e0257c26bb5d5b14c8c89919e1293e76c Author: Steffen Maier Date: Tue Sep 4 15:23:34 2012 +0200 SCSI: zfcp: restore refcount check on port_remove commit d99b601b63386f3395dc26a699ae703a273d9982 upstream. Upstream commit f3450c7b917201bb49d67032e9f60d5125675d6a "[SCSI] zfcp: Replace local reference counting with common kref" accidentally dropped a reference count check before tearing down zfcp_ports that are potentially in use by zfcp_units. Even remote ports in use can be removed causing unreachable garbage objects zfcp_ports with zfcp_units. Thus units won't come back even after a manual port_rescan. The kref of zfcp_port->dev.kobj is already used by the driver core. We cannot re-use it to track the number of zfcp_units. Re-introduce our own counter for units per port and check on port_remove. Signed-off-by: Steffen Maier Reviewed-by: Heiko Carstens Signed-off-by: James Bottomley Signed-off-by: Greg Kroah-Hartman commit ab37eb284ac88f092187f7c4ee4a42c28cdaf3e5 Author: Julia Lawall Date: Tue Sep 4 15:23:33 2012 +0200 SCSI: zfcp: remove invalid reference to list iterator variable commit ca579c9f136af4274ccfd1bcaee7f38a29a0e2e9 upstream. If list_for_each_entry, etc complete a traversal of the list, the iterator variable ends up pointing to an address at an offset from the list head, and not a meaningful structure. Thus this value should not be used after the end of the iterator. Replace port->adapter->scsi_host by adapter->scsi_host. This problem was found using Coccinelle (http://coccinelle.lip6.fr/). Oversight in upsteam commit of v2.6.37 a1ca48319a9aa1c5b57ce142f538e76050bb8972 "[SCSI] zfcp: Move ACL/CFDC code to zfcp_cfdc.c" which merged the content of zfcp_erp_port_access_changed(). Signed-off-by: Julia Lawall Signed-off-by: Steffen Maier Reviewed-by: Martin Peschke Signed-off-by: James Bottomley Signed-off-by: Greg Kroah-Hartman commit 530bab7e87951aea1ae20706174021c3355f305a Author: Steffen Maier Date: Tue Sep 4 15:23:32 2012 +0200 SCSI: zfcp: Do not wakeup while suspended commit cb45214960bc989af8b911ebd77da541c797717d upstream. If the mapping of FCP device bus ID and corresponding subchannel is modified while the Linux image is suspended, the resume of FCP devices can fail. During resume, zfcp gets callbacks from cio regarding the modified subchannels but they can be arbitrarily mixed with the restore/resume callback. Since the cio callbacks would trigger adapter recovery, zfcp could wakeup before the resume callback. Therefore, ignore the cio callbacks regarding subchannels while being suspended. We can safely do so, since zfcp does not deal itself with subchannels. For problem determination purposes, we still trace the ignored callback events. The following kernel messages could be seen on resume: kernel: : parent should not be sleeping As part of adapter reopen recovery, zfcp performs auto port scanning which can erroneously try to register new remote ports with scsi_transport_fc and the device core code complains about the parent (adapter) still sleeping. kernel: zfcp.3dff9c: :\ Setting up the QDIO connection to the FCP adapter failed kernel: zfcp.574d43: :\ ERP cannot recover an error on the FCP device In such cases, the adapter gave up recovery and remained blocked along with its child objects: remote ports and LUNs/scsi devices. Even the adapter shutdown as part of giving up recovery failed because the ccw device state remained disconnected. Later, the corresponding remote ports ran into dev_loss_tmo. As a result, the LUNs were erroneously not available again after resume. Even a manually triggered adapter recovery (e.g. sysfs attribute failed, or device offline/online via sysfs) could not recover the adapter due to the remaining disconnected state of the corresponding ccw device. Signed-off-by: Steffen Maier Signed-off-by: James Bottomley Signed-off-by: Greg Kroah-Hartman commit 2872405ba2729853da43865893469f8cb80b69b2 Author: Steffen Maier Date: Tue Sep 4 15:23:31 2012 +0200 SCSI: zfcp: Bounds checking for deferred error trace commit 01e60527f0a49b3d7df603010bd6079bb4b6cf07 upstream. The pl vector has scount elements, i.e. pl[scount-1] is the last valid element. For maximum sized requests, payload->counter == scount after the last loop iteration. Therefore, do bounds checking first (with boolean shortcut) to not access the invalid element pl[scount]. Do not trust the maximum sbale->scount value from the HBA but ensure we won't access the pl vector out of our allocated bounds. While at it, clean up scoping and prevent unnecessary memset. Minor fix for 86a9668a8d29ea711613e1cb37efa68e7c4db564 "[SCSI] zfcp: support for hardware data router" Signed-off-by: Steffen Maier Reviewed-by: Martin Peschke Signed-off-by: James Bottomley Signed-off-by: Greg Kroah-Hartman commit 1afbcbdd1c0761ca9e807e78fa106efc465c3925 Author: Steffen Maier Date: Tue Sep 4 15:23:30 2012 +0200 SCSI: zfcp: Make trace record tags unique commit 0100998dbfe6dfcd90a6e912ca7ed6f255d48f25 upstream. Duplicate fssrh_2 from a54ca0f62f953898b05549391ac2a8a4dad6482b "[SCSI] zfcp: Redesign of the debug tracing for HBA records." complicates distinction of generic status read response from local link up. Duplicate fsscth1 from 2c55b750a884b86dea8b4cc5f15e1484cc47a25c "[SCSI] zfcp: Redesign of the debug tracing for SAN records." complicates distinction of good common transport response from invalid port handle. Signed-off-by: Steffen Maier Reviewed-by: Martin Peschke Signed-off-by: James Bottomley Signed-off-by: Greg Kroah-Hartman commit ab617446730e46080235b9e158d364bab233e9ee Author: Steffen Maier Date: Tue Sep 4 15:23:29 2012 +0200 SCSI: zfcp: Adapt to new FC_PORTSPEED semantics commit d22019778cd9ea04c1dadf7bf453920d5288f8d9 upstream. Commit a9277e7783651d4e0a849f7988340b1c1cf748a4 "[SCSI] scsi_transport_fc: Getting FC Port Speed in sync with FC-GS" changed the semantics of FC_PORTSPEED defines to FDMI port attributes of FC-HBA/SM-HBA which is different from the previous bit reversed Report Port Speed Capabilities (RPSC) ELS of FC-GS/FC-LS. Zfcp showed "10 Gbit" instead of "4 Gbit" for supported_speeds. It now uses explicit bit conversion as the other LLDs already do, in order to be independent of the kernel bit semantics. See also http://marc.info/?l=linux-scsi&m=134452926830730&w=2 Signed-off-by: Steffen Maier Reviewed-by: Martin Peschke Signed-off-by: James Bottomley Signed-off-by: Greg Kroah-Hartman commit 194cca171bc5a062974d96b0246cde0c65ac33d0 Author: Florian Zumbiehl Date: Tue Oct 2 12:20:37 2012 +0000 drm/savage: re-add busmaster enable, regression fix commit df86b5765a48d5f557489577652bd6df145b0e1b upstream. 466e69b8b03b8c1987367912782bc12988ad8794 dropped busmaster enable from the global drm code and moved it to the individual drivers, but missed the savage driver. So, this re-adds busmaster enable to the savage driver, fixing the regression. Signed-off-by: Florian Zumbiehl Reviewed-by: Alex Deucher Signed-off-by: Dave Airlie Signed-off-by: Greg Kroah-Hartman commit fa9a48a7bed7d785a98356d4d06090500ec00c82 Author: Ed Cashin Date: Wed Sep 19 15:46:39 2012 +0000 aoe: assert AoE packets marked as requiring no checksum [ Upstream commit 8babe8cc6570ed896b7b596337eb8fe730c3ff45 ] In order for the network layer to see that AoE requires no checksumming in a generic way, the packets must be marked as requiring no checksum, so we make this requirement explicit with the assertion. Signed-off-by: Ed Cashin Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 2583c9724757046ed749ec7d1fef2e371b02f203 Author: Ed Cashin Date: Wed Sep 19 15:49:00 2012 +0000 net: do not disable sg for packets requiring no checksum [ Upstream commit c0d680e577ff171e7b37dbdb1b1bf5451e851f04 ] A change in a series of VLAN-related changes appears to have inadvertently disabled the use of the scatter gather feature of network cards for transmission of non-IP ethernet protocols like ATA over Ethernet (AoE). Below is a reference to the commit that introduces a "harmonize_features" function that turns off scatter gather when the NIC does not support hardware checksumming for the ethernet protocol of an sk buff. commit f01a5236bd4b140198fbcc550f085e8361fd73fa Author: Jesse Gross Date: Sun Jan 9 06:23:31 2011 +0000 net offloading: Generalize netif_get_vlan_features(). The can_checksum_protocol function is not equipped to consider a protocol that does not require checksumming. Calling it for a protocol that requires no checksum is inappropriate. The patch below has harmonize_features call can_checksum_protocol when the protocol needs a checksum, so that the network layer is not forced to perform unnecessary skb linearization on the transmission of AoE packets. Unnecessary linearization results in decreased performance and increased memory pressure, as reported here: http://www.spinics.net/lists/linux-mm/msg15184.html The problem has probably not been widely experienced yet, because only recently has the kernel.org-distributed aoe driver acquired the ability to use payloads of over a page in size, with the patchset recently included in the mm tree: https://lkml.org/lkml/2012/8/28/140 The coraid.com-distributed aoe driver already could use payloads of greater than a page in size, but its users generally do not use the newest kernels. Signed-off-by: Ed Cashin Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 782d596c84bf4239b24906132ba6367ca7dca865 Author: Alan Cox Date: Tue Sep 4 04:13:18 2012 +0000 netrom: copy_datagram_iovec can fail [ Upstream commit 6cf5c951175abcec4da470c50565cc0afe6cd11d ] Check for an error from this and if so bail properly. Signed-off-by: Alan Cox Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit bc0b2168aed1ecf0d71975af12d4f0ffecb92bfc Author: Eric Dumazet Date: Tue Sep 4 15:54:55 2012 -0400 l2tp: fix a typo in l2tp_eth_dev_recv() [ Upstream commit c0cc88a7627c333de50b07b7c60b1d49d9d2e6cc ] While investigating l2tp bug, I hit a bug in eth_type_trans(), because not enough bytes were pulled in skb head. Signed-off-by: Eric Dumazet Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 28ad5c792deb17e1274cef32c59049d2062ed1b3 Author: Eric Dumazet Date: Tue Sep 25 22:01:28 2012 +0200 ipv6: mip6: fix mip6_mh_filter() [ Upstream commit 96af69ea2a83d292238bdba20e4508ee967cf8cb ] mip6_mh_filter() should not modify its input, or else its caller would need to recompute ipv6_hdr() if skb->head is reallocated. Use skb_header_pointer() instead of pskb_may_pull() Signed-off-by: Eric Dumazet Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 7a20f9c5fa76e602bf9dda7f610c8de04e7afa04 Author: Eric Dumazet Date: Tue Sep 25 07:03:40 2012 +0000 ipv6: raw: fix icmpv6_filter() [ Upstream commit 1b05c4b50edbddbdde715c4a7350629819f6655e ] icmpv6_filter() should not modify its input, or else its caller would need to recompute ipv6_hdr() if skb->head is reallocated. Use skb_header_pointer() instead of pskb_may_pull() and change the prototype to make clear both sk and skb are const. Also, if icmpv6 header cannot be found, do not deliver the packet, as we do in IPv4. Signed-off-by: Eric Dumazet Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 514ddfedb89c19c57de82aedec8da2bd8ff3802c Author: Eric Dumazet Date: Sat Sep 22 00:08:29 2012 +0000 ipv4: raw: fix icmp_filter() [ Upstream commit ab43ed8b7490cb387782423ecf74aeee7237e591 ] icmp_filter() should not modify its input, or else its caller would need to recompute ip_hdr() if skb->head is reallocated. Use skb_header_pointer() instead of pskb_may_pull() and change the prototype to make clear both sk and skb are const. Signed-off-by: Eric Dumazet Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 09c6cf7f980f1e8dbf58dc9ae0ebf4f6eb93cc0d Author: Eric Dumazet Date: Mon Sep 24 07:00:11 2012 +0000 net: guard tcp_set_keepalive() to tcp sockets [ Upstream commit 3e10986d1d698140747fcfc2761ec9cb64c1d582 ] Its possible to use RAW sockets to get a crash in tcp_set_keepalive() / sk_reset_timer() Fix is to make sure socket is a SOCK_STREAM one. Reported-by: Dave Jones Signed-off-by: Eric Dumazet Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 6b8fc5c4eba92b5cd3c9ca0d926e99831604f81e Author: Chema Gonzalez Date: Fri Sep 7 13:40:50 2012 +0000 net: small bug on rxhash calculation [ Upstream commit 6862234238e84648c305526af2edd98badcad1e0 ] In the current rxhash calculation function, while the sorting of the ports/addrs is coherent (you get the same rxhash for packets sharing the same 4-tuple, in both directions), ports and addrs are sorted independently. This implies packets from a connection between the same addresses but crossed ports hash to the same rxhash. For example, traffic between A=S:l and B=L:s is hashed (in both directions) from {L, S, {s, l}}. The same rxhash is obtained for packets between C=S:s and D=L:l. This patch ensures that you either swap both addrs and ports, or you swap none. Traffic between A and B, and traffic between C and D, get their rxhash from different sources ({L, S, {l, s}} for A<->B, and {L, S, {s, l}} for C<->D) The patch is co-written with Eric Dumazet Signed-off-by: Chema Gonzalez Signed-off-by: Eric Dumazet Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit e043257dde697ded17ed99f280cdb7643fdc007a Author: Xiaodong Xu Date: Sat Sep 22 00:09:32 2012 +0000 pppoe: drop PPPOX_ZOMBIEs in pppoe_release [ Upstream commit 2b018d57ff18e5405823e5cb59651a5b4d946d7b ] When PPPOE is running over a virtual ethernet interface (e.g., a bonding interface) and the user tries to delete the interface in case the PPPOE state is ZOMBIE, the kernel will loop forever while unregistering net_device for the reference count is not decreased to zero which should have been done with dev_put(). Signed-off-by: Xiaodong Xu Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 8d16c6268b7c3af2ce4f58de903588489e037fcf Author: Thomas Graf Date: Mon Sep 3 04:27:42 2012 +0000 sctp: Don't charge for data in sndbuf again when transmitting packet [ Upstream commit 4c3a5bdae293f75cdf729c6c00124e8489af2276 ] SCTP charges wmem_alloc via sctp_set_owner_w() in sctp_sendmsg() and via skb_set_owner_w() in sctp_packet_transmit(). If a sender runs out of sndbuf it will sleep in sctp_wait_for_sndbuf() and expects to be waken up by __sctp_write_space(). Buffer space charged via sctp_set_owner_w() is released in sctp_wfree() which calls __sctp_write_space() directly. Buffer space charged via skb_set_owner_w() is released via sock_wfree() which calls sk->sk_write_space() _if_ SOCK_USE_WRITE_QUEUE is not set. sctp_endpoint_init() sets SOCK_USE_WRITE_QUEUE on all sockets. Therefore if sctp_packet_transmit() manages to queue up more than sndbuf bytes, sctp_wait_for_sndbuf() will never be woken up again unless it is interrupted by a signal. This could be fixed by clearing the SOCK_USE_WRITE_QUEUE flag but ... Charging for the data twice does not make sense in the first place, it leads to overcharging sndbuf by a factor 2. Therefore this patch only charges a single byte in wmem_alloc when transmitting an SCTP packet to ensure that the socket stays alive until the packet has been released. This means that control chunks are no longer accounted for in wmem_alloc which I believe is not a problem as skb->truesize will typically lead to overcharging anyway and thus compensates for any control overhead. Signed-off-by: Thomas Graf CC: Vlad Yasevich CC: Neil Horman CC: David Miller Acked-by: Vlad Yasevich Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 2033554a2fe3c5e54764f2f1dba0baff3261f8b5 Author: Michal Kubeček Date: Fri Sep 14 04:59:52 2012 +0000 tcp: flush DMA queue before sk_wait_data if rcv_wnd is zero [ Upstream commit 15c041759bfcd9ab0a4e43f1c16e2644977d0467 ] If recv() syscall is called for a TCP socket so that - IOAT DMA is used - MSG_WAITALL flag is used - requested length is bigger than sk_rcvbuf - enough data has already arrived to bring rcv_wnd to zero then when tcp_recvmsg() gets to calling sk_wait_data(), receive window can be still zero while sk_async_wait_queue exhausts enough space to keep it zero. As this queue isn't cleaned until the tcp_service_net_dma() call, sk_wait_data() cannot receive any data and blocks forever. If zero receive window and non-empty sk_async_wait_queue is detected before calling sk_wait_data(), process the queue first. Signed-off-by: Michal Kubecek Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 410eafac650a906e990351a01ec70451064df83d Author: Wei Yongjun Date: Thu Sep 20 18:29:56 2012 +0000 ipv6: fix return value check in fib6_add() [ Upstream commit f950c0ecc78f745e490d615280e031de4dbb1306 ] In case of error, the function fib6_add_1() returns ERR_PTR() or NULL pointer. The ERR_PTR() case check is missing in fib6_add(). dpatch engine is used to generated this patch. (https://github.com/weiyj/dpatch) Signed-off-by: Wei Yongjun Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit d5e36b089edcc8179d4640e1a8e5bca6fb74409e Author: Nicolas Dichtel Date: Wed Sep 26 00:04:55 2012 +0000 ipv6: del unreachable route when an addr is deleted on lo [ Upstream commit 64c6d08e6490fb18cea09bb03686c149946bd818 ] When an address is added on loopback (ip -6 a a 2002::1/128 dev lo), two routes are added: - one in the local table: local 2002::1 via :: dev lo proto none metric 0 - one the in main table (for the prefix): unreachable 2002::1 dev lo proto kernel metric 256 error -101 When the address is deleted, the route inserted in the main table remains because we use rt6_lookup(), which returns NULL when dst->error is set, which is the case here! Thus, it is better to use ip6_route_lookup() to avoid this kind of filter. Signed-off-by: Nicolas Dichtel Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 17de307472bf21479e6d7c35211204b6ea186a7c Author: Gao feng Date: Wed Sep 19 19:25:34 2012 +0000 ipv6: release reference of ip6_null_entry's dst entry in __ip6_del_rt [ Upstream commit 6825a26c2dc21eb4f8df9c06d3786ddec97cf53b ] as we hold dst_entry before we call __ip6_del_rt, so we should alse call dst_release not only return -ENOENT when the rt6_info is ip6_null_entry. and we already hold the dst entry, so I think it's safe to call dst_release out of the write-read lock. Signed-off-by: Gao feng Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 2ab08687cf48805c5abd0f9a785e09181eda9492 Author: Antonio Quartulli Date: Tue Oct 2 06:14:17 2012 +0000 8021q: fix mac_len recomputation in vlan_untag() [ Upstream commit 5316cf9a5197eb80b2800e1acadde287924ca975 ] skb_reset_mac_len() relies on the value of the skb->network_header pointer, therefore we must wait for such pointer to be recalculated before computing the new mac_len value. Signed-off-by: Antonio Quartulli Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 97d5d3295198279362552d9b810c088d3410da23 Author: Lennart Sorensen Date: Fri Sep 7 12:14:02 2012 +0000 sierra_net: Endianess bug fix. [ Upstream commit 2120c52da6fe741454a60644018ad2a6abd957ac ] I discovered I couldn't get sierra_net to work on a powerpc. Turns out the firmware attribute check assumes the system is little endian and hence fails because the attributes is a 16 bit value. Signed-off-by: Len Sorensen Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 5ee708f19bd6a1da1d7bab5916382bbbfba4edbb Author: Paolo Valente Date: Sat Sep 15 00:41:35 2012 +0000 pkt_sched: fix virtual-start-time update in QFQ [ Upstream commit 71261956973ba9e0637848a5adb4a5819b4bae83 ] If the old timestamps of a class, say cl, are stale when the class becomes active, then QFQ may assign to cl a much higher start time than the maximum value allowed. This may happen when QFQ assigns to the start time of cl the finish time of a group whose classes are characterized by a higher value of the ratio max_class_pkt/weight_of_the_class with respect to that of cl. Inserting a class with a too high start time into the bucket list corrupts the data structure and may eventually lead to crashes. This patch limits the maximum start time assigned to a class. Signed-off-by: Paolo Valente Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 52ee75479f2aea816d8bb6a9d6caf5c1ebb36724 Author: Eric Dumazet Date: Tue Sep 11 13:11:12 2012 +0000 net-sched: sch_cbq: avoid infinite loop [ Upstream commit bdfc87f7d1e253e0a61e2fc6a75ea9d76f7fc03a ] Its possible to setup a bad cbq configuration leading to an infinite loop in cbq_classify() DEV_OUT=eth0 ICMP="match ip protocol 1 0xff" U32="protocol ip u32" DST="match ip dst" tc qdisc add dev $DEV_OUT root handle 1: cbq avpkt 1000 \ bandwidth 100mbit tc class add dev $DEV_OUT parent 1: classid 1:1 cbq \ rate 512kbit allot 1500 prio 5 bounded isolated tc filter add dev $DEV_OUT parent 1: prio 3 $U32 \ $ICMP $DST 192.168.3.234 flowid 1: Reported-by: Denys Fedoryschenko Tested-by: Denys Fedoryschenko Signed-off-by: Eric Dumazet Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 6720119023f635bd8c285530a7092716c23bdfcc Author: Nikolay Aleksandrov Date: Fri Sep 14 05:50:03 2012 +0000 netxen: check for root bus in netxen_mask_aer_correctable [ Upstream commit e4d1aa40e363ed3e0486aeeeb0d173f7f822737e ] Add a check if pdev->bus->self == NULL (root bus). When attaching a netxen NIC to a VM it can be on the root bus and the guest would crash in netxen_mask_aer_correctable() because of a NULL pointer dereference if CONFIG_PCIEAER is present. Signed-off-by: Nikolay Aleksandrov Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit ec7fd138e2e690efff5eaec279ebc2c4740589e4 Author: Florian Fainelli Date: Mon Sep 10 14:06:58 2012 +0200 ixp4xx_hss: fix build failure due to missing linux/module.h inclusion [ Upstream commit 0b836ddde177bdd5790ade83772860940bd481ea ] Commit 36a1211970193ce215de50ed1e4e1272bc814df1 (netprio_cgroup.h: dont include module.h from other includes) made the following build error on ixp4xx_hss pop up: CC [M] drivers/net/wan/ixp4xx_hss.o drivers/net/wan/ixp4xx_hss.c:1412:20: error: expected ';', ',' or ')' before string constant drivers/net/wan/ixp4xx_hss.c:1413:25: error: expected ';', ',' or ')' before string constant drivers/net/wan/ixp4xx_hss.c:1414:21: error: expected ';', ',' or ')' before string constant drivers/net/wan/ixp4xx_hss.c:1415:19: error: expected ';', ',' or ')' before string constant make[8]: *** [drivers/net/wan/ixp4xx_hss.o] Error 1 This was previously hidden because ixp4xx_hss includes linux/hdlc.h which includes linux/netdevice.h which includes linux/netprio_cgroup.h which used to include linux/module.h. The real issue was actually present since the initial commit that added this driver since it uses macros from linux/module.h without including this file. Signed-off-by: Florian Fainelli Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit f7ee46fb594cf2c3dd3a96457a028336fb34d5a6 Author: htbegin Date: Mon Oct 1 16:42:43 2012 +0000 net: ethernet: davinci_cpdma: decrease the desc count when cleaning up the remaining packets [ Upstream commit ffb5ba90017505a19e238e986e6d33f09e4df765 ] chan->count is used by rx channel. If the desc count is not updated by the clean up loop in cpdma_chan_stop, the value written to the rxfree register in cpdma_chan_start will be incorrect. Signed-off-by: Tao Hou Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 53bf1469924e07385b4493d3cbd78551d4afaaa3 Author: Mathias Krause Date: Thu Sep 20 10:01:49 2012 +0000 xfrm_user: ensure user supplied esn replay window is valid [ Upstream commit ecd7918745234e423dd87fcc0c077da557909720 ] The current code fails to ensure that the netlink message actually contains as many bytes as the header indicates. If a user creates a new state or updates an existing one but does not supply the bytes for the whole ESN replay window, the kernel copies random heap bytes into the replay bitmap, the ones happen to follow the XFRMA_REPLAY_ESN_VAL netlink attribute. This leads to following issues: 1. The replay window has random bits set confusing the replay handling code later on. 2. A malicious user could use this flaw to leak up to ~3.5kB of heap memory when she has access to the XFRM netlink interface (requires CAP_NET_ADMIN). Known users of the ESN replay window are strongSwan and Steffen's iproute2 patch (). The latter uses the interface with a bitmap supplied while the former does not. strongSwan is therefore prone to run into issue 1. To fix both issues without breaking existing userland allow using the XFRMA_REPLAY_ESN_VAL netlink attribute with either an empty bitmap or a fully specified one. For the former case we initialize the in-kernel bitmap with zero, for the latter we copy the user supplied bitmap. For state updates the full bitmap must be supplied. To prevent overflows in the bitmap length calculation the maximum size of bmp_len is limited to 128 by this patch -- resulting in a maximum replay window of 4096 packets. This should be sufficient for all real life scenarios (RFC 4303 recommends a default replay window size of 64). Signed-off-by: Mathias Krause Cc: Steffen Klassert Cc: Martin Willi Cc: Ben Hutchings Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 743b911d8b2214bfa9ecd1631edbe6f61f8fdced Author: Mathias Krause Date: Wed Sep 19 11:33:43 2012 +0000 xfrm_user: don't copy esn replay window twice for new states [ Upstream commit e3ac104d41a97b42316915020ba228c505447d21 ] The ESN replay window was already fully initialized in xfrm_alloc_replay_state_esn(). No need to copy it again. Signed-off-by: Mathias Krause Cc: Steffen Klassert Acked-by: Steffen Klassert Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 0c5e37586ef83845acbae1738e693bc97c12d4c3 Author: Mathias Krause Date: Wed Sep 19 11:33:41 2012 +0000 xfrm_user: fix info leak in copy_to_user_tmpl() [ Upstream commit 1f86840f897717f86d523a13e99a447e6a5d2fa5 ] The memory used for the template copy is a local stack variable. As struct xfrm_user_tmpl contains multiple holes added by the compiler for alignment, not initializing the memory will lead to leaking stack bytes to userland. Add an explicit memset(0) to avoid the info leak. Initial version of the patch by Brad Spengler. Signed-off-by: Mathias Krause Cc: Brad Spengler Acked-by: Steffen Klassert Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 97f96eab8eb32f3178439f73acca4e286c091435 Author: Mathias Krause Date: Wed Sep 19 11:33:40 2012 +0000 xfrm_user: fix info leak in copy_to_user_policy() [ Upstream commit 7b789836f434c87168eab067cfbed1ec4783dffd ] The memory reserved to dump the xfrm policy includes multiple padding bytes added by the compiler for alignment (padding bytes in struct xfrm_selector and struct xfrm_userpolicy_info). Add an explicit memset(0) before filling the buffer to avoid the heap info leak. Signed-off-by: Mathias Krause Acked-by: Steffen Klassert Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit d5f1f7c230df5f2a198fb231547f1b298594c709 Author: Mathias Krause Date: Wed Sep 19 11:33:39 2012 +0000 xfrm_user: fix info leak in copy_to_user_state() [ Upstream commit f778a636713a435d3a922c60b1622a91136560c1 ] The memory reserved to dump the xfrm state includes the padding bytes of struct xfrm_usersa_info added by the compiler for alignment (7 for amd64, 3 for i386). Add an explicit memset(0) before filling the buffer to avoid the info leak. Signed-off-by: Mathias Krause Acked-by: Steffen Klassert Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 37d61a27a59671d88279dcc4d331f950d4901d4d Author: Mathias Krause Date: Wed Sep 19 11:33:38 2012 +0000 xfrm_user: fix info leak in copy_to_user_auth() [ Upstream commit 4c87308bdea31a7b4828a51f6156e6f721a1fcc9 ] copy_to_user_auth() fails to initialize the remainder of alg_name and therefore discloses up to 54 bytes of heap memory via netlink to userland. Use strncpy() instead of strcpy() to fill the trailing bytes of alg_name with null bytes. Signed-off-by: Mathias Krause Acked-by: Steffen Klassert Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit a91af73f445cacfb0db4df3eb2e3d0ddeff43893 Author: Li RongQing Date: Mon Sep 17 22:40:10 2012 +0000 xfrm: fix a read lock imbalance in make_blackhole [ Upstream commit 433a19548061bb5457b6ab77ed7ea58ca6e43ddb ] if xfrm_policy_get_afinfo returns 0, it has already released the read lock, xfrm_policy_put_afinfo should not be called again. Signed-off-by: Li RongQing Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit f38b334adca51bbf18ad549a9736c0f86bb4a375 Author: Mathias Krause Date: Fri Sep 14 09:58:32 2012 +0000 xfrm_user: return error pointer instead of NULL #2 [ Upstream commit c25463722509fef0ed630b271576a8c9a70236f3 ] When dump_one_policy() returns an error, e.g. because of a too small buffer to dump the whole xfrm policy, xfrm_policy_netlink() returns NULL instead of an error pointer. But its caller expects an error pointer and therefore continues to operate on a NULL skbuff. Signed-off-by: Mathias Krause Acked-by: Steffen Klassert Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 555144b63d57c0df7a2677868f83957a34135207 Author: Mathias Krause Date: Thu Sep 13 11:41:26 2012 +0000 xfrm_user: return error pointer instead of NULL [ Upstream commit 864745d291b5ba80ea0bd0edcbe67273de368836 ] When dump_one_state() returns an error, e.g. because of a too small buffer to dump the whole xfrm state, xfrm_state_netlink() returns NULL instead of an error pointer. But its callers expect an error pointer and therefore continue to operate on a NULL skbuff. This could lead to a privilege escalation (execution of user code in kernel context) if the attacker has CAP_NET_ADMIN and is able to map address 0. Signed-off-by: Mathias Krause Acked-by: Steffen Klassert Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 20eb20851385e53d27dff9ed79c4e68e58e3d9da Author: Steffen Klassert Date: Tue Sep 4 00:03:29 2012 +0000 xfrm: Workaround incompatibility of ESN and async crypto [ Upstream commit 3b59df46a449ec9975146d71318c4777ad086744 ] ESN for esp is defined in RFC 4303. This RFC assumes that the sequence number counters are always up to date. However, this is not true if an async crypto algorithm is employed. If the sequence number counters are not up to date on sequence number check, we may incorrectly update the upper 32 bit of the sequence number. This leads to a DOS. We workaround this by comparing the upper sequence number, (used for authentication) with the upper sequence number computed after the async processing. We drop the packet if these numbers are different. To do this, we introduce a recheck function that does this check in the ESN case. Signed-off-by: Steffen Klassert Acked-by: Herbert Xu Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 657197486950474bf30290344339fd0914fe99c9 Author: Michal Schmidt Date: Thu Sep 13 12:59:44 2012 +0000 bnx2x: fix rx checksum validation for IPv6 [ Upstream commit e488921f44765e8ab6c48ca35e3f6b78df9819df ] Commit d6cb3e41 "bnx2x: fix checksum validation" caused a performance regression for IPv6. Rx checksum offload does not work. IPv6 packets are passed to the stack with CHECKSUM_NONE. The hardware obviously cannot perform IP checksum validation for IPv6, because there is no checksum in the IPv6 header. This should not prevent us from setting CHECKSUM_UNNECESSARY. Tested on BCM57711. Signed-off-by: Michal Schmidt Acked-by: Eric Dumazet Acked-by: Eilon Greenstein Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit e878ead68937be79e530ea9bf568766ab454e3ee Author: Yuta Ando Date: Mon Oct 1 23:24:30 2012 +0900 localmodconfig: Fix localyesconfig to set to 'y' not 'm' commit 4eae518d4b01b0cbf2f0d8edb5a6f3d6245ee8fb upstream. The kbuild target 'localyesconfig' has been same as 'localmodconfig' since the commit 50bce3e "kconfig/streamline_config.pl: merge local{mod,yes}config". The commit expects this script generates different configure depending on target, but it was not yet implemented. So I added code that sets to 'yes' when target is 'localyesconfig'. Link: http://lkml.kernel.org/r/1349101470-12243-1-git-send-email-yuta.and@gmail.com Signed-off-by: Yuta Ando Cc: linux-kbuild@vger.kernel.org Signed-off-by: Steven Rostedt Signed-off-by: Greg Kroah-Hartman commit 0699c6dd66e7083f837cfe620837ddcf10649b89 Author: Eric Sandeen Date: Sat Aug 18 22:29:40 2012 -0400 jbd2: don't write superblock when if its empty commit eeecef0af5ea4efd763c9554cf2bd80fc4a0efd3 upstream. This sequence: # truncate --size=1g fsfile # mkfs.ext4 -F fsfile # mount -o loop,ro fsfile /mnt # umount /mnt # dmesg | tail results in an IO error when unmounting the RO filesystem: [ 318.020828] Buffer I/O error on device loop1, logical block 196608 [ 318.027024] lost page write due to I/O error on loop1 [ 318.032088] JBD2: Error -5 detected when updating journal superblock for loop1-8. This was a regression introduced by commit 24bcc89c7e7c: "jbd2: split updating of journal superblock and marking journal empty". Signed-off-by: Eric Sandeen Signed-off-by: "Theodore Ts'o" Signed-off-by: Greg Kroah-Hartman commit 66307aef43a3a51cccf03f4ca8063cfce1bd4ac1 Author: Tejun Heo Date: Fri Aug 3 10:30:45 2012 -0700 workqueue: add missing smp_wmb() in process_one_work() commit 959d1af8cffc8fd38ed53e8be1cf4ab8782f9c00 upstream. WORK_STRUCT_PENDING is used to claim ownership of a work item and process_one_work() releases it before starting execution. When someone else grabs PENDING, all pre-release updates to the work item should be visible and all updates made by the new owner should happen afterwards. Grabbing PENDING uses test_and_set_bit() and thus has a full barrier; however, clearing doesn't have a matching wmb. Given the preceding spin_unlock and use of clear_bit, I don't believe this can be a problem on an actual machine and there hasn't been any related report but it still is theretically possible for clear_pending to permeate upwards and happen before work->entry update. Add an explicit smp_wmb() before work_clear_pending(). Signed-off-by: Tejun Heo Cc: Oleg Nesterov Signed-off-by: Greg Kroah-Hartman commit 24a0c2063c805e9cf1f3f418bb7f444b5b3f0e4e Author: Feng Hong Date: Wed Sep 19 14:16:00 2012 +0200 PM / Sleep: use resume event when call dpm_resume_early commit 997a031107ec962967ce36db9bc500f1fad491c1 upstream. When dpm_suspend_noirq fail, state is PMSG_SUSPEND, should change to PMSG_RESUME when dpm_resume_early is called Signed-off-by: Feng Hong Signed-off-by: Raul Xiong Signed-off-by: Neil Zhang Signed-off-by: Rafael J. Wysocki Signed-off-by: Greg Kroah-Hartman commit a332509fcd9064cae48d26e09aec46e08247223b Author: Alexandre Bounine Date: Thu Oct 4 17:15:48 2012 -0700 rapidio/rionet: fix multicast packet transmit logic commit 7c4a6106d6451fc03c491e61df37c044505d843a upstream. Fix multicast packet transmit logic to account for repetitive transmission of single skb: - correct check for available buffers (this bug may produce NULL pointer crash dump in case of heavy traffic); - update skb user count (incorrect user counter causes a warning dump from net_tx_action routine during multicast transfers in systems with three or more rionet participants). Signed-off-by: Alexandre Bounine Cc: Matt Porter Cc: David S. Miller Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit ed55202fda5c68e44464cd5afdc819a8721b601a Author: Gavin Shan Date: Mon Sep 17 04:34:28 2012 +0000 powerpc/eeh: Fix crash on converting OF node to edev commit 1e38b7140185e384da216aff66a711df09b5afc9 upstream. The kernel crash was reported by Alexy. He was testing some feature with private kernel, in which Alexy added some code in pci_pm_reset() to read the CSR after writting it. The bug could be reproduced on Fiber Channel card (Fibre Channel: Emulex Corporation Saturn-X: LightPulse Fibre Channel Host Adapter (rev 03)) by the following commands. # echo 1 > /sys/devices/pci0004:01/0004:01:00.0/reset # rmmod lpfc # modprobe lpfc The history behind the test case is that those additional config space reading operations in pci_pm_reset() would cause EEH error, but we didn't detect EEH error until "modprobe lpfc". For the case, all the PCI devices on PCI bus (0004:01) were removed and added after PE reset. Then the EEH devices would be figured out again based on the OF nodes. Unfortunately, there were some child OF nodes under PCI device (0004:01:00.0), but they didn't have attached PCI_DN since they're invisible from PCI domain. However, we were still trying to convert OF node to EEH device without checking on the attached PCI_DN. Eventually, it caused the kernel crash as follows: Unable to handle kernel paging request for data at address 0x00000030 Faulting instruction address: 0xc00000000004d888 cpu 0x0: Vector: 300 (Data Access) at [c000000fc797b950] pc: c00000000004d888: .eeh_add_device_tree_early+0x78/0x140 lr: c00000000004d880: .eeh_add_device_tree_early+0x70/0x140 sp: c000000fc797bbd0 msr: 8000000000009032 dar: 30 dsisr: 40000000 current = 0xc000000fc78d9f70 paca = 0xc00000000edb0000 softe: 0 irq_happened: 0x00 pid = 2951, comm = eehd enter ? for help [c000000fc797bc50] c00000000004d848 .eeh_add_device_tree_early+0x38/0x140 [c000000fc797bcd0] c00000000004d848 .eeh_add_device_tree_early+0x38/0x140 [c000000fc797bd50] c000000000051b54 .pcibios_add_pci_devices+0x34/0x190 [c000000fc797bde0] c00000000004fb10 .eeh_reset_device+0x100/0x160 [c000000fc797be70] c0000000000502dc .eeh_handle_event+0x19c/0x300 [c000000fc797bf00] c000000000050570 .eeh_event_handler+0x130/0x1a0 [c000000fc797bf90] c000000000020138 .kernel_thread+0x54/0x70 The patch changes of_node_to_eeh_dev() and just returns NULL if the passed OF node doesn't have attached PCI_DN. Reported-by: Alexey Kardashevskiy Signed-off-by: Gavin Shan Signed-off-by: Benjamin Herrenschmidt Signed-off-by: Greg Kroah-Hartman commit 9825e3158e302980b71c54b72e90f0624e1cfab3 Author: Rusty Russell Date: Thu Oct 4 12:03:25 2012 +0930 lguest: fix occasional crash in example launcher. commit ca16f580a5db7e60bfafe59a50bb133bd3347491 upstream. We usually got away with ->next on the final entry being NULL, but it finally bit me. Signed-off-by: Rusty Russell Signed-off-by: Greg Kroah-Hartman commit 0c0b534583fb1c02809e0a3978413fe79f7dac8f Author: Fabio Estevam Date: Thu Oct 4 17:11:16 2012 -0700 drivers/dma/dmaengine.c: lower the priority of 'failed to get' dma channel message commit 0eb5a35801df3c438ce3fc91310a415ea4452c00 upstream. Do the same as commit a03a202e95fd ("dmaengine: failure to get a specific DMA channel is not critical") to get rid of the following messages during kernel boot: dmaengine_get: failed to get dma1chan0: (-22) dmaengine_get: failed to get dma1chan1: (-22) dmaengine_get: failed to get dma1chan2: (-22) dmaengine_get: failed to get dma1chan3: (-22) .. Signed-off-by: Fabio Estevam Cc: Vinod Koul Cc: Dan Williams Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 03f0a11553d06002af53f2d399da6c711e7d3b40 Author: Martin Michlmayr Date: Thu Oct 4 17:11:25 2012 -0700 drivers/scsi/atp870u.c: fix bad use of udelay commit 0f6d93aa9d96cc9022b51bd10d462b03296be146 upstream. The ACARD driver calls udelay() with a value > 2000, which leads to to the following compilation error on ARM: ERROR: "__bad_udelay" [drivers/scsi/atp870u.ko] undefined! make[1]: *** [__modpost] Error 1 This is because udelay is defined on ARM, roughly speaking, as #define udelay(n) ((n) > 2000 ? __bad_udelay() : \ __const_udelay((n) * ((2199023U*HZ)>>11))) The argument to __const_udelay is the number of jiffies to wait divided by 4, but this does not work unless the multiplication does not overflow, and that is what the build error is designed to prevent. The intended behavior can be achieved by using mdelay to call udelay multiple times in a loop. [jrnieder@gmail.com: adding context] Signed-off-by: Martin Michlmayr Signed-off-by: Jonathan Nieder Cc: James Bottomley Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 3fc49ce67fefd3de38aa72a215eb1a2b68a7e89b Author: Shawn Guo Date: Thu Oct 4 17:12:23 2012 -0700 kernel/sys.c: call disable_nonboot_cpus() in kernel_restart() commit f96972f2dc6365421cf2366ebd61ee4cf060c8d5 upstream. As kernel_power_off() calls disable_nonboot_cpus(), we may also want to have kernel_restart() call disable_nonboot_cpus(). Doing so can help machines that require boot cpu be the last alive cpu during reboot to survive with kernel restart. This fixes one reboot issue seen on imx6q (Cortex-A9 Quad). The machine requires that the restart routine be run on the primary cpu rather than secondary ones. Otherwise, the secondary core running the restart routine will fail to come to online after reboot. Signed-off-by: Shawn Guo Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 6d36a9dd1dcdd053ab32f303b94cc1bae5a47f22 Author: Davidlohr Bueso Date: Thu Oct 4 17:13:18 2012 -0700 lib/gcd.c: prevent possible div by 0 commit e96875677fb2b7cb739c5d7769824dff7260d31d upstream. Account for all properties when a and/or b are 0: gcd(0, 0) = 0 gcd(a, 0) = a gcd(0, b) = b Fixes no known problems in current kernels. Signed-off-by: Davidlohr Bueso Cc: Eric Dumazet Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 03c4441b88fd2614580eeda598623e2b21ec5822 Author: Mark Brown Date: Tue Aug 7 19:42:43 2012 +0100 mfd: max8925: Move _IO resources out of ioport_ioresource commit bee6e1fa617b1fb7f6f91033428410e05f5ab0ed upstream. The removal of mach/io.h from most ARM platforms also set the range of valid IO ports to be empty for most platforms when previously any 32 bit integer had been valid. This makes it impossible to add IO resources as the added range is smaller than that of the root resource for IO ports. Since we're not really using IO memory at all fix this by defining our own root resource outside the normal tree and make that the parent of all IO resources. This also ensures we won't conflict with read IO ports if we ever run on a platform which happens to use them. Signed-off-by: Mark Brown Acked-by: Arnd Bergmann Acked-by: Haojian Zhuang Tested-by: Haojian Zhuang Signed-off-by: Samuel Ortiz Signed-off-by: Greg Kroah-Hartman commit 95482ba9494658e69aecf0f97b64f3f2ab82af62 Author: Bjorn Helgaas Date: Wed Jun 20 16:18:29 2012 -0600 PCI: acpiphp: check whether _ADR evaluation succeeded commit dfb117b3e50c52c7b3416db4a4569224b8db80bb upstream. Check whether we evaluated _ADR successfully. Previously we ignored failure, so we would have used garbage data from the stack as the device and function number. We return AE_OK so that we ignore only this slot and continue looking for other slots. Found by Coverity (CID 113981). Signed-off-by: Bjorn Helgaas Signed-off-by: Ben Hutchings Signed-off-by: Greg Kroah-Hartman commit 2006e6f75e7cb88e0508eeff066834c847ce91b3 Author: Lin Ming Date: Mon Jul 16 16:30:21 2012 +0800 ACPI: run _OSC after ACPI_FULL_INITIALIZATION commit fc54ab72959edbf229b65ac74b2f122d799ca002 upstream. The _OSC method may exist in module level code, so it must be called after ACPI_FULL_INITIALIZATION On some new platforms with Zero-Power-Optical-Disk-Drive (ZPODD) support, this fix is necessary to save power. Signed-off-by: Lin Ming Tested-by: Aaron Lu Signed-off-by: Len Brown Signed-off-by: Greg Kroah-Hartman commit 991e71944f30eb7debd2d00abb749657b409b357 Author: Frank Schäfer Date: Sun Sep 9 15:02:19 2012 -0300 media: gspca_pac7302: add support for device 1ae7:2001 Speedlink Snappy Microphone SL-6825-SBK commit 97d2fbf501e3cf105ac957086c7e40e62e15cdf8 upstream. Signed-off-by: Frank Schäfer Signed-off-by: Hans de Goede Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Greg Kroah-Hartman commit 63c486377c32dcfd5775c04b4d0cc5bc53990787 Author: Ben Hutchings Date: Sun Aug 19 19:32:27 2012 -0300 media: rc: ite-cir: Initialise ite_dev::rdev earlier commit 4b961180ef275035b1538317ffd0e21e80e63e77 upstream. ite_dev::rdev is currently initialised in ite_probe() after rc_register_device() returns. If a newly registered device is opened quickly enough, we may enable interrupts and try to use ite_dev::rdev before it has been initialised. Move it up to the earliest point we can, right after calling rc_allocate_device(). Reported-and-tested-by: YunQiang Su Signed-off-by: Ben Hutchings Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Greg Kroah-Hartman commit 63b2f08626e0cacaa038c296df52a7cb504ae318 Author: Alex Williamson Date: Fri Nov 11 17:26:44 2011 -0700 intel-iommu: Default to non-coherent for domains unattached to iommus commit 2e12bc29fc5a12242d68e11875db3dd58efad9ff upstream. domain_update_iommu_coherency() currently defaults to setting domains as coherent when the domain is not attached to any iommus. This allows for a window in domain_context_mapping_one() where such a domain can update context entries non-coherently, and only after update the domain capability to clear iommu_coherency. This can be seen using KVM device assignment on VT-d systems that do not support coherency in the ecap register. When a device is added to a guest, a domain is created (iommu_coherency = 0), the device is attached, and ranges are mapped. If we then hot unplug the device, the coherency is updated and set to the default (1) since no iommus are attached to the domain. A subsequent attach of a device makes use of the same dmar domain (now marked coherent) updates context entries with coherency enabled, and only disables coherency as the last step in the process. To fix this, switch domain_update_iommu_coherency() to use the safer, non-coherent default for domains not attached to iommus. Signed-off-by: Alex Williamson Tested-by: Donald Dutile Acked-by: Donald Dutile Acked-by: Chris Wright Signed-off-by: Joerg Roedel Signed-off-by: Greg Kroah-Hartman commit 750025ab1950b5e32e308c446de32c399cf2d6f6 Author: Michael Wang Date: Wed Sep 5 10:33:18 2012 +0800 slab: fix the DEADLOCK issue on l3 alien lock commit 947ca1856a7e60aa6d20536785e6a42dff25aa6e upstream. DEADLOCK will be report while running a kernel with NUMA and LOCKDEP enabled, the process of this fake report is: kmem_cache_free() //free obj in cachep -> cache_free_alien() //acquire cachep's l3 alien lock -> __drain_alien_cache() -> free_block() -> slab_destroy() -> kmem_cache_free() //free slab in cachep->slabp_cache -> cache_free_alien() //acquire cachep->slabp_cache's l3 alien lock Since the cachep and cachep->slabp_cache's l3 alien are in the same lock class, fake report generated. This should not happen since we already have init_lock_keys() which will reassign the lock class for both l3 list and l3 alien. However, init_lock_keys() was invoked at a wrong position which is before we invoke enable_cpucache() on each cache. Since until set slab_state to be FULL, we won't invoke enable_cpucache() on caches to build their l3 alien while creating them, so although we invoked init_lock_keys(), the l3 alien lock class won't change since we don't have them until invoked enable_cpucache() later. This patch will invoke init_lock_keys() after we done enable_cpucache() instead of before to avoid the fake DEADLOCK report. Michael traced the problem back to a commit in release 3.0.0: commit 30765b92ada267c5395fc788623cb15233276f5c Author: Peter Zijlstra Date: Thu Jul 28 23:22:56 2011 +0200 slab, lockdep: Annotate the locks before using them Fernando found we hit the regular OFF_SLAB 'recursion' before we annotate the locks, cure this. The relevant portion of the stack-trace: > [ 0.000000] [] rt_spin_lock+0x50/0x56 > [ 0.000000] [] __cache_free+0x43/0xc3 > [ 0.000000] [] kmem_cache_free+0x6c/0xdc > [ 0.000000] [] slab_destroy+0x4f/0x53 > [ 0.000000] [] free_block+0x94/0xc1 > [ 0.000000] [] do_tune_cpucache+0x10b/0x2bb > [ 0.000000] [] enable_cpucache+0x7b/0xa7 > [ 0.000000] [] kmem_cache_init_late+0x1f/0x61 > [ 0.000000] [] start_kernel+0x24c/0x363 > [ 0.000000] [] i386_start_kernel+0xa9/0xaf Reported-by: Fernando Lopez-Lezcano Acked-by: Pekka Enberg Signed-off-by: Peter Zijlstra Link: http://lkml.kernel.org/r/1311888176.2617.379.camel@laptop Signed-off-by: Ingo Molnar The commit moved init_lock_keys() before we build up the alien, so we failed to reclass it. Acked-by: Christoph Lameter Tested-by: Paul E. McKenney Signed-off-by: Michael Wang Signed-off-by: Pekka Enberg Signed-off-by: Greg Kroah-Hartman commit 4893cf612a68d26cdbe0e16ba9c42772136e2340 Author: Jean Delvare Date: Tue Oct 2 16:42:36 2012 +0200 kbuild: Fix gcc -x syntax commit b1e0d8b70fa31821ebca3965f2ef8619d7c5e316 upstream. The correct syntax for gcc -x is "gcc -x assembler", not "gcc -xassembler". Even though the latter happens to work, the former is what is documented in the manual page and thus what gcc wrappers such as icecream do expect. This isn't a cosmetic change. The missing space prevents icecream from recognizing compilation tasks it can't handle, leading to silent kernel miscompilations. Besides me, credits go to Michael Matz and Dirk Mueller for investigating the miscompilation issue and tracking it down to this incorrect -x parameter syntax. Signed-off-by: Jean Delvare Acked-by: Ingo Molnar Cc: Bernhard Walle Cc: Michal Marek Cc: Ralf Baechle Signed-off-by: Michal Marek Signed-off-by: Greg Kroah-Hartman commit f15977883584e7b52832518c3fef115957d3203b Author: Sascha Hauer Date: Thu Oct 4 17:11:17 2012 -0700 kbuild: make: fix if_changed when command contains backslashes commit c353acba28fb3fa1fd05fd6b85a9fc7938330f9c upstream. The call if_changed mechanism does not work when the command contains backslashes. This basically is an issue with lzo and bzip2 compressed kernels. The compressed binaries do not contain the uncompressed image size, so these use size_append to append the size. This results in backslashes in the executed command. With this if_changed always detects a change in the command and rebuilds the compressed image even if nothing has changed. Fix this by escaping backslashes in make-cmd Signed-off-by: Sascha Hauer Signed-off-by: Jan Luebbe Cc: Sam Ravnborg Cc: Bernhard Walle Cc: Michal Marek Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 605843502b0573c0c865b13f770d971aadaf0c41 Author: Geert Uytterhoeven Date: Thu Oct 4 17:11:13 2012 -0700 mn10300: only add -mmem-funcs to KBUILD_CFLAGS if gcc supports it commit 9957423f035c2071f6d1c5d2f095cdafbeb25ad7 upstream. It seems the current (gcc 4.6.3) no longer provides this so make it conditional. As reported by Tony before, the mn10300 architecture cross-compiles with gcc-4.6.3 if -mmem-funcs is not added to KBUILD_CFLAGS. Reported-by: Tony Breeds Signed-off-by: Geert Uytterhoeven Cc: David Howells Cc: Koichi Yasutake Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman