Update to 4.19.67

* Drop patches which have been applied to 4.19-stable
* Drop "Revert "net: stmmac: Send TSO packets always from Queue 0"" in
  favour of upstream fix "net: stmmac: Re-work the queue selection for
  TSO packets"
* Refresh patches that became fuzzy
Ben Hutchings 2019-08-18 20:12:38 +01:00
parent 64c3754b90
commit 0899b0f554
97 changed files with 2216 additions and 11030 deletions

debian/changelog (vendored): 2146 changes; diff suppressed because it is too large.


@@ -1,52 +0,0 @@
From: Christoph Hellwig <hch@lst.de>
Date: Thu, 22 Nov 2018 16:44:07 +0100
Subject: [01/14] aio: clear IOCB_HIPRI
Origin: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/commit?id=9101cbe70ef64c7f35fb75552005a3a696cc288e
commit 154989e45fd8de9bfb52bbd6e5ea763e437e54c5 upstream.
No one is going to poll for aio (yet), so we must clear the HIPRI
flag, as we would otherwise send it down the poll queues, where no
one will be polling for completions.
Signed-off-by: Christoph Hellwig <hch@lst.de>
IOCB_HIPRI, not RWF_HIPRI.
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Cc: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
fs/aio.c | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)
diff --git a/fs/aio.c b/fs/aio.c
index 45d5ef8dd0a8..78aa249070b1 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -1438,8 +1438,7 @@ static int aio_prep_rw(struct kiocb *req, struct iocb *iocb)
ret = ioprio_check_cap(iocb->aio_reqprio);
if (ret) {
pr_debug("aio ioprio check cap error: %d\n", ret);
- fput(req->ki_filp);
- return ret;
+ goto out_fput;
}
req->ki_ioprio = iocb->aio_reqprio;
@@ -1448,7 +1447,13 @@ static int aio_prep_rw(struct kiocb *req, struct iocb *iocb)
ret = kiocb_set_rw_flags(req, iocb->aio_rw_flags);
if (unlikely(ret))
- fput(req->ki_filp);
+ goto out_fput;
+
+ req->ki_flags &= ~IOCB_HIPRI; /* no one is going to poll for this I/O */
+ return 0;
+
+out_fput:
+ fput(req->ki_filp);
return ret;
}
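
A sketch of the single-exit goto idiom this patch converts aio_prep_rw() to,
with hypothetical resource and step names (not kernel code):

    #include <stdio.h>

    struct resource { int id; };

    static void resource_put(struct resource *r)
    {
        printf("released resource %d\n", r->id);
    }

    static int step_a(void) { return 0; }   /* succeeds */
    static int step_b(void) { return -22; } /* fails, like -EINVAL */

    /* Every failure after the resource is acquired funnels through a
     * single label that releases it, so the cleanup lives in one place. */
    static int prep(struct resource *r)
    {
        int ret;

        ret = step_a();
        if (ret)
            goto out_put;

        ret = step_b();
        if (ret)
            goto out_put;

        return 0;

    out_put:
        resource_put(r);
        return ret;
    }

    int main(void)
    {
        struct resource r = { .id = 1 };
        return prep(&r) ? 1 : 0;
    }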


@@ -1,52 +0,0 @@
From: Linus Torvalds <torvalds@linux-foundation.org>
Date: Thu, 11 Apr 2019 10:06:20 -0700
Subject: mm: make page ref count overflow check tighter and more explicit
Origin: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/commit?id=9f6da5fd05577ef4a05c1744cc7098d0173823af
Bug-Debian-Security: https://security-tracker.debian.org/tracker/CVE-2019-11487
commit f958d7b528b1b40c44cfda5eabe2d82760d868c3 upstream.
We have a VM_BUG_ON() to check that the page reference count doesn't
underflow (or get close to overflow) by checking the sign of the count.
That's all fine, but we actually want to allow people to use a "get page
ref unless it's already very high" helper function, and we want that one
to use the sign of the page ref (without triggering this VM_BUG_ON).
Change the VM_BUG_ON to only check for small underflows (or _very_ close
to overflowing), and ignore overflows which have strayed into negative
territory.
Acked-by: Matthew Wilcox <willy@infradead.org>
Cc: Jann Horn <jannh@google.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
include/linux/mm.h | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index e899460f1bc5..9965704813dc 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -915,6 +915,10 @@ static inline bool is_device_public_page(const struct page *page)
}
#endif /* CONFIG_DEV_PAGEMAP_OPS */
+/* 127: arbitrary random number, small enough to assemble well */
+#define page_ref_zero_or_close_to_overflow(page) \
+ ((unsigned int) page_ref_count(page) + 127u <= 127u)
+
static inline void get_page(struct page *page)
{
page = compound_head(page);
@@ -922,7 +926,7 @@ static inline void get_page(struct page *page)
* Getting a normal page or the head of a compound page
* requires to already have an elevated page->_refcount.
*/
- VM_BUG_ON_PAGE(page_ref_count(page) <= 0, page);
+ VM_BUG_ON_PAGE(page_ref_zero_or_close_to_overflow(page), page);
page_ref_inc(page);
}
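
The macro relies on unsigned wrap-around: casting the signed refcount to
unsigned int and adding 127 maps exactly the values -127..0 onto 0..127, so a
single unsigned comparison catches both a zero or underflowed count and a
count that has overflowed just past INT_MAX into negative territory. A quick
standalone check (hypothetical test harness, not part of the patch):

    #include <stdio.h>

    /* Mirrors page_ref_zero_or_close_to_overflow() for a plain int. */
    static int zero_or_close_to_overflow(int count)
    {
        return (unsigned int)count + 127u <= 127u;
    }

    int main(void)
    {
        int samples[] = { -128, -127, -1, 0, 1, 127, 2147483647 };
        unsigned int i;

        /* Only -127..0 trip the check. */
        for (i = 0; i < sizeof(samples) / sizeof(samples[0]); i++)
            printf("count=%11d -> trips=%d\n", samples[i],
                   zero_or_close_to_overflow(samples[i]));
        return 0;
    }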


@@ -1,32 +0,0 @@
From: Jens Axboe <axboe@kernel.dk>
Date: Tue, 6 Nov 2018 14:27:13 -0700
Subject: [02/14] aio: use assigned completion handler
Origin: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/commit?id=b3373253f0bab538a7521537dfcb73e731b3d732
commit bc9bff61624ac33b7c95861abea1af24ee7a94fc upstream.
We know this is a read/write request, but in preparation for
having different kinds of those, ensure that we call the assigned
handler instead of assuming it's aio_complete_rq().
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Cc: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
fs/aio.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/aio.c b/fs/aio.c
index 78aa249070b1..3df3fb0678e5 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -1492,7 +1492,7 @@ static inline void aio_rw_done(struct kiocb *req, ssize_t ret)
ret = -EINTR;
/*FALLTHRU*/
default:
- aio_complete_rw(req, ret, 0);
+ req->ki_complete(req, ret, 0);
}
}


@@ -1,56 +0,0 @@
From: Linus Torvalds <torvalds@linux-foundation.org>
Date: Thu, 11 Apr 2019 10:14:59 -0700
Subject: mm: add 'try_get_page()' helper function
Origin: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/commit?id=0612cae7ec6b79d2ff1b34562bab79d5bf96327a
Bug-Debian-Security: https://security-tracker.debian.org/tracker/CVE-2019-11487
commit 88b1a17dfc3ed7728316478fae0f5ad508f50397 upstream.
This is the same as the traditional 'get_page()' function, but instead
of unconditionally incrementing the reference count of the page, it only
does so if the count was "safe". It returns whether the reference count
was incremented (and is marked __must_check, since the caller obviously
has to be aware of it).
Also like 'get_page()', you can't use this function unless you already
had a reference to the page. The intent is that you can use this
exactly like get_page(), but in situations where you want to limit the
maximum reference count.
The code currently does an unconditional WARN_ON_ONCE() if we ever hit
the reference count issues (either zero or negative), as a notification
that the conditional non-increment actually happened.
NOTE! The count access for the "safety" check is inherently racy, but
that doesn't matter since the buffer we use is basically half the range
of the reference count (ie we look at the sign of the count).
Acked-by: Matthew Wilcox <willy@infradead.org>
Cc: Jann Horn <jannh@google.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
include/linux/mm.h | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 9965704813dc..bdec425c8e14 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -930,6 +930,15 @@ static inline void get_page(struct page *page)
page_ref_inc(page);
}
+static inline __must_check bool try_get_page(struct page *page)
+{
+ page = compound_head(page);
+ if (WARN_ON_ONCE(page_ref_count(page) <= 0))
+ return false;
+ page_ref_inc(page);
+ return true;
+}
+
static inline void put_page(struct page *page)
{
page = compound_head(page);
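
Since try_get_page() is __must_check, callers have to handle the failure
explicitly. A minimal kernel-context sketch of the caller pattern
(hypothetical function, assumes <linux/mm.h>; the real conversions appear in
the get_user_pages() patch below):

    /* Take an extra reference only if it is safe to do so. */
    static int pin_page_sketch(struct page *page)
    {
        if (!try_get_page(page))    /* count was zero or wrapped negative */
            return -ENOMEM;
        /* ... use the page ... */
        put_page(page);
        return 0;
    }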


@@ -1,101 +0,0 @@
From: Christoph Hellwig <hch@lst.de>
Date: Mon, 19 Nov 2018 15:57:42 -0700
Subject: [03/14] aio: separate out ring reservation from req allocation
Origin: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/commit?id=730198c889d85db78058cfb57c1b41c65f55c94e
commit 432c79978c33ecef91b1b04cea6936c20810da29 upstream.
This is in preparation for certain types of IO not needing a ring
reservation.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Cc: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
fs/aio.c | 30 +++++++++++++++++-------------
1 file changed, 17 insertions(+), 13 deletions(-)
diff --git a/fs/aio.c b/fs/aio.c
index 3df3fb0678e5..b9e0df08277b 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -902,7 +902,7 @@ static void put_reqs_available(struct kioctx *ctx, unsigned nr)
local_irq_restore(flags);
}
-static bool get_reqs_available(struct kioctx *ctx)
+static bool __get_reqs_available(struct kioctx *ctx)
{
struct kioctx_cpu *kcpu;
bool ret = false;
@@ -994,6 +994,14 @@ static void user_refill_reqs_available(struct kioctx *ctx)
spin_unlock_irq(&ctx->completion_lock);
}
+static bool get_reqs_available(struct kioctx *ctx)
+{
+ if (__get_reqs_available(ctx))
+ return true;
+ user_refill_reqs_available(ctx);
+ return __get_reqs_available(ctx);
+}
+
/* aio_get_req
* Allocate a slot for an aio request.
* Returns NULL if no requests are free.
@@ -1002,24 +1010,15 @@ static inline struct aio_kiocb *aio_get_req(struct kioctx *ctx)
{
struct aio_kiocb *req;
- if (!get_reqs_available(ctx)) {
- user_refill_reqs_available(ctx);
- if (!get_reqs_available(ctx))
- return NULL;
- }
-
req = kmem_cache_alloc(kiocb_cachep, GFP_KERNEL|__GFP_ZERO);
if (unlikely(!req))
- goto out_put;
+ return NULL;
percpu_ref_get(&ctx->reqs);
INIT_LIST_HEAD(&req->ki_list);
refcount_set(&req->ki_refcnt, 0);
req->ki_ctx = ctx;
return req;
-out_put:
- put_reqs_available(ctx, 1);
- return NULL;
}
static struct kioctx *lookup_ioctx(unsigned long ctx_id)
@@ -1813,9 +1812,13 @@ static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb,
return -EINVAL;
}
+ if (!get_reqs_available(ctx))
+ return -EAGAIN;
+
+ ret = -EAGAIN;
req = aio_get_req(ctx);
if (unlikely(!req))
- return -EAGAIN;
+ goto out_put_reqs_available;
if (iocb.aio_flags & IOCB_FLAG_RESFD) {
/*
@@ -1878,11 +1881,12 @@ static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb,
goto out_put_req;
return 0;
out_put_req:
- put_reqs_available(ctx, 1);
percpu_ref_put(&ctx->reqs);
if (req->ki_eventfd)
eventfd_ctx_put(req->ki_eventfd);
kmem_cache_free(kiocb_cachep, req);
+out_put_reqs_available:
+ put_reqs_available(ctx, 1);
return ret;
}


@@ -1,155 +0,0 @@
From: Linus Torvalds <torvalds@linux-foundation.org>
Date: Thu, 11 Apr 2019 10:49:19 -0700
Subject: mm: prevent get_user_pages() from overflowing page refcount
Origin: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/commit?id=d972ebbf42ba6712460308ae57c222a0706f2af3
Bug-Debian-Security: https://security-tracker.debian.org/tracker/CVE-2019-11487
commit 8fde12ca79aff9b5ba951fce1a2641901b8d8e64 upstream.
If the page refcount wraps around past zero, it will be freed while
there are still four billion references to it. One of the possible
avenues for an attacker to try to make this happen is by doing direct IO
on a page multiple times. This patch makes get_user_pages() refuse to
take a new page reference if there are already more than two billion
references to the page.
Reported-by: Jann Horn <jannh@google.com>
Acked-by: Matthew Wilcox <willy@infradead.org>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
mm/gup.c | 45 ++++++++++++++++++++++++++++++++++-----------
mm/hugetlb.c | 13 +++++++++++++
2 files changed, 47 insertions(+), 11 deletions(-)
diff --git a/mm/gup.c b/mm/gup.c
index 0a5374e6e82d..caadd31714a5 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -153,7 +153,10 @@ static struct page *follow_page_pte(struct vm_area_struct *vma,
}
if (flags & FOLL_GET) {
- get_page(page);
+ if (unlikely(!try_get_page(page))) {
+ page = ERR_PTR(-ENOMEM);
+ goto out;
+ }
/* drop the pgmap reference now that we hold the page */
if (pgmap) {
@@ -296,7 +299,10 @@ static struct page *follow_pmd_mask(struct vm_area_struct *vma,
if (pmd_trans_unstable(pmd))
ret = -EBUSY;
} else {
- get_page(page);
+ if (unlikely(!try_get_page(page))) {
+ spin_unlock(ptl);
+ return ERR_PTR(-ENOMEM);
+ }
spin_unlock(ptl);
lock_page(page);
ret = split_huge_page(page);
@@ -480,7 +486,10 @@ static int get_gate_page(struct mm_struct *mm, unsigned long address,
if (is_device_public_page(*page))
goto unmap;
}
- get_page(*page);
+ if (unlikely(!try_get_page(*page))) {
+ ret = -ENOMEM;
+ goto unmap;
+ }
out:
ret = 0;
unmap:
@@ -1368,6 +1377,20 @@ static void undo_dev_pagemap(int *nr, int nr_start, struct page **pages)
}
}
+/*
+ * Return the compound head page with ref appropriately incremented,
+ * or NULL if that failed.
+ */
+static inline struct page *try_get_compound_head(struct page *page, int refs)
+{
+ struct page *head = compound_head(page);
+ if (WARN_ON_ONCE(page_ref_count(head) < 0))
+ return NULL;
+ if (unlikely(!page_cache_add_speculative(head, refs)))
+ return NULL;
+ return head;
+}
+
#ifdef CONFIG_ARCH_HAS_PTE_SPECIAL
static int gup_pte_range(pmd_t pmd, unsigned long addr, unsigned long end,
int write, struct page **pages, int *nr)
@@ -1402,9 +1425,9 @@ static int gup_pte_range(pmd_t pmd, unsigned long addr, unsigned long end,
VM_BUG_ON(!pfn_valid(pte_pfn(pte)));
page = pte_page(pte);
- head = compound_head(page);
- if (!page_cache_get_speculative(head))
+ head = try_get_compound_head(page, 1);
+ if (!head)
goto pte_unmap;
if (unlikely(pte_val(pte) != pte_val(*ptep))) {
@@ -1543,8 +1566,8 @@ static int gup_huge_pmd(pmd_t orig, pmd_t *pmdp, unsigned long addr,
refs++;
} while (addr += PAGE_SIZE, addr != end);
- head = compound_head(pmd_page(orig));
- if (!page_cache_add_speculative(head, refs)) {
+ head = try_get_compound_head(pmd_page(orig), refs);
+ if (!head) {
*nr -= refs;
return 0;
}
@@ -1581,8 +1604,8 @@ static int gup_huge_pud(pud_t orig, pud_t *pudp, unsigned long addr,
refs++;
} while (addr += PAGE_SIZE, addr != end);
- head = compound_head(pud_page(orig));
- if (!page_cache_add_speculative(head, refs)) {
+ head = try_get_compound_head(pud_page(orig), refs);
+ if (!head) {
*nr -= refs;
return 0;
}
@@ -1618,8 +1641,8 @@ static int gup_huge_pgd(pgd_t orig, pgd_t *pgdp, unsigned long addr,
refs++;
} while (addr += PAGE_SIZE, addr != end);
- head = compound_head(pgd_page(orig));
- if (!page_cache_add_speculative(head, refs)) {
+ head = try_get_compound_head(pgd_page(orig), refs);
+ if (!head) {
*nr -= refs;
return 0;
}
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 9e5f66cbf711..5fb779cda972 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -4299,6 +4299,19 @@ long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma,
pfn_offset = (vaddr & ~huge_page_mask(h)) >> PAGE_SHIFT;
page = pte_page(huge_ptep_get(pte));
+
+ /*
+ * Instead of doing 'try_get_page()' below in the same_page
+ * loop, just check the count once here.
+ */
+ if (unlikely(page_count(page) <= 0)) {
+ if (pages) {
+ spin_unlock(ptl);
+ remainder = 0;
+ err = -ENOMEM;
+ break;
+ }
+ }
same_page:
if (pages) {
pages[i] = mem_map_offset(page, pfn_offset);


@@ -1,52 +0,0 @@
From: Jens Axboe <axboe@kernel.dk>
Date: Tue, 4 Dec 2018 09:44:49 -0700
Subject: [04/14] aio: don't zero entire aio_kiocb aio_get_req()
Origin: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/commit?id=ef529eead8cfc11c051b90d239e7137f7141ea94
commit 2bc4ca9bb600cbe36941da2b2a67189fc4302a04 upstream.
It's 192 bytes, fairly substantial. Most items don't need to be cleared,
especially not upfront. Clear the ones we do need to clear, and leave
the other ones for setup when the iocb is prepared and submitted.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Cc: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
fs/aio.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/fs/aio.c b/fs/aio.c
index b9e0df08277b..2547f17b4fef 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -1010,14 +1010,15 @@ static inline struct aio_kiocb *aio_get_req(struct kioctx *ctx)
{
struct aio_kiocb *req;
- req = kmem_cache_alloc(kiocb_cachep, GFP_KERNEL|__GFP_ZERO);
+ req = kmem_cache_alloc(kiocb_cachep, GFP_KERNEL);
if (unlikely(!req))
return NULL;
percpu_ref_get(&ctx->reqs);
+ req->ki_ctx = ctx;
INIT_LIST_HEAD(&req->ki_list);
refcount_set(&req->ki_refcnt, 0);
- req->ki_ctx = ctx;
+ req->ki_eventfd = NULL;
return req;
}
@@ -1738,6 +1739,10 @@ static ssize_t aio_poll(struct aio_kiocb *aiocb, struct iocb *iocb)
if (unlikely(!req->file))
return -EBADF;
+ req->head = NULL;
+ req->woken = false;
+ req->cancelled = false;
+
apt.pt._qproc = aio_poll_queue_proc;
apt.pt._key = req->events;
apt.iocb = aiocb;


@@ -1,162 +0,0 @@
From: Matthew Wilcox <willy@infradead.org>
Date: Fri, 5 Apr 2019 14:02:10 -0700
Subject: fs: prevent page refcount overflow in pipe_buf_get
Origin: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/commit?id=0311ff82b70fa12e80d188635bff24029ec06ae1
Bug-Debian-Security: https://security-tracker.debian.org/tracker/CVE-2019-11487
commit 15fab63e1e57be9fdb5eec1bbc5916e9825e9acb upstream.
Change pipe_buf_get() to return a bool indicating whether it succeeded
in raising the refcount of the page (if the thing in the pipe is a page).
This removes another mechanism for overflowing the page refcount. All
callers converted to handle a failure.
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Matthew Wilcox <willy@infradead.org>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
fs/fuse/dev.c | 12 ++++++------
fs/pipe.c | 4 ++--
fs/splice.c | 12 ++++++++++--
include/linux/pipe_fs_i.h | 10 ++++++----
kernel/trace/trace.c | 6 +++++-
5 files changed, 29 insertions(+), 15 deletions(-)
--- a/fs/fuse/dev.c
+++ b/fs/fuse/dev.c
@@ -1989,10 +1989,8 @@ static ssize_t fuse_dev_splice_write(str
rem += pipe->bufs[(pipe->curbuf + idx) & (pipe->buffers - 1)].len;
ret = -EINVAL;
- if (rem < len) {
- pipe_unlock(pipe);
- goto out;
- }
+ if (rem < len)
+ goto out_free;
rem = len;
while (rem) {
@@ -2010,7 +2008,9 @@ static ssize_t fuse_dev_splice_write(str
pipe->curbuf = (pipe->curbuf + 1) & (pipe->buffers - 1);
pipe->nrbufs--;
} else {
- pipe_buf_get(pipe, ibuf);
+ if (!pipe_buf_get(pipe, ibuf))
+ goto out_free;
+
*obuf = *ibuf;
obuf->flags &= ~PIPE_BUF_FLAG_GIFT;
obuf->len = rem;
@@ -2033,11 +2033,11 @@ static ssize_t fuse_dev_splice_write(str
ret = fuse_dev_do_write(fud, &cs, len);
pipe_lock(pipe);
+out_free:
for (idx = 0; idx < nbuf; idx++)
pipe_buf_release(pipe, &bufs[idx]);
pipe_unlock(pipe);
-out:
kvfree(bufs);
return ret;
}
--- a/fs/pipe.c
+++ b/fs/pipe.c
@@ -189,9 +189,9 @@ EXPORT_SYMBOL(generic_pipe_buf_steal);
* in the tee() system call, when we duplicate the buffers in one
* pipe into another.
*/
-void generic_pipe_buf_get(struct pipe_inode_info *pipe, struct pipe_buffer *buf)
+bool generic_pipe_buf_get(struct pipe_inode_info *pipe, struct pipe_buffer *buf)
{
- get_page(buf->page);
+ return try_get_page(buf->page);
}
EXPORT_SYMBOL(generic_pipe_buf_get);
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -1586,7 +1586,11 @@ retry:
* Get a reference to this pipe buffer,
* so we can copy the contents over.
*/
- pipe_buf_get(ipipe, ibuf);
+ if (!pipe_buf_get(ipipe, ibuf)) {
+ if (ret == 0)
+ ret = -EFAULT;
+ break;
+ }
*obuf = *ibuf;
/*
@@ -1660,7 +1664,11 @@ static int link_pipe(struct pipe_inode_i
* Get a reference to this pipe buffer,
* so we can copy the contents over.
*/
- pipe_buf_get(ipipe, ibuf);
+ if (!pipe_buf_get(ipipe, ibuf)) {
+ if (ret == 0)
+ ret = -EFAULT;
+ break;
+ }
obuf = opipe->bufs + nbuf;
*obuf = *ibuf;
--- a/include/linux/pipe_fs_i.h
+++ b/include/linux/pipe_fs_i.h
@@ -108,18 +108,20 @@ struct pipe_buf_operations {
/*
* Get a reference to the pipe buffer.
*/
- void (*get)(struct pipe_inode_info *, struct pipe_buffer *);
+ bool (*get)(struct pipe_inode_info *, struct pipe_buffer *);
};
/**
* pipe_buf_get - get a reference to a pipe_buffer
* @pipe: the pipe that the buffer belongs to
* @buf: the buffer to get a reference to
+ *
+ * Return: %true if the reference was successfully obtained.
*/
-static inline void pipe_buf_get(struct pipe_inode_info *pipe,
+static inline __must_check bool pipe_buf_get(struct pipe_inode_info *pipe,
struct pipe_buffer *buf)
{
- buf->ops->get(pipe, buf);
+ return buf->ops->get(pipe, buf);
}
/**
@@ -178,7 +180,7 @@ struct pipe_inode_info *alloc_pipe_info(
void free_pipe_info(struct pipe_inode_info *);
/* Generic pipe buffer ops functions */
-void generic_pipe_buf_get(struct pipe_inode_info *, struct pipe_buffer *);
+bool generic_pipe_buf_get(struct pipe_inode_info *, struct pipe_buffer *);
int generic_pipe_buf_confirm(struct pipe_inode_info *, struct pipe_buffer *);
int generic_pipe_buf_steal(struct pipe_inode_info *, struct pipe_buffer *);
int generic_pipe_buf_nosteal(struct pipe_inode_info *, struct pipe_buffer *);
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -6820,12 +6820,16 @@ static void buffer_pipe_buf_release(stru
buf->private = 0;
}
-static void buffer_pipe_buf_get(struct pipe_inode_info *pipe,
+static bool buffer_pipe_buf_get(struct pipe_inode_info *pipe,
struct pipe_buffer *buf)
{
struct buffer_ref *ref = (struct buffer_ref *)buf->private;
+ if (refcount_read(&ref->refcount) > INT_MAX/2)
+ return false;
+
refcount_inc(&ref->refcount);
+ return true;
}
/* Pipe buffer operations for a buffer. */
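
The trace buffer conversion above shows the other half of the defence: refuse
to take a reference once the count is within half the range of overflow,
rather than detecting the wrap after the fact. A user-space sketch of that
saturation guard (hypothetical names; the kernel uses refcount_t, so
atomicity is omitted here):

    #include <limits.h>
    #include <stdbool.h>

    struct bounded_ref { int count; };

    /* Mirror of the buffer_pipe_buf_get() check: past INT_MAX/2 we
     * refuse new references instead of risking an eventual wrap. */
    static bool bounded_ref_get(struct bounded_ref *r)
    {
        if (r->count > INT_MAX / 2)
            return false;
        r->count++;
        return true;
    }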


@@ -1,34 +0,0 @@
From: Jens Axboe <axboe@kernel.dk>
Date: Sat, 24 Nov 2018 21:33:09 -0700
Subject: [05/14] aio: use iocb_put() instead of open coding it
Origin: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/commit?id=4d677689742ab60d5be46e20708276368564427a
commit 71ebc6fef0f53459f37fb39e1466792232fa52ee upstream.
Replace the percpu_ref_put() + kmem_cache_free() with a call to
iocb_put() instead.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Cc: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
fs/aio.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/fs/aio.c b/fs/aio.c
index 2547f17b4fef..e2b63ab28ecc 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -1886,10 +1886,9 @@ static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb,
goto out_put_req;
return 0;
out_put_req:
- percpu_ref_put(&ctx->reqs);
if (req->ki_eventfd)
eventfd_ctx_put(req->ki_eventfd);
- kmem_cache_free(kiocb_cachep, req);
+ iocb_put(req);
out_put_reqs_available:
put_reqs_available(ctx, 1);
return ret;


@@ -1,194 +0,0 @@
From: Jens Axboe <axboe@kernel.dk>
Date: Sat, 24 Nov 2018 14:46:14 -0700
Subject: [06/14] aio: split out iocb copy from io_submit_one()
Origin: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/commit?id=d384f8b855a573ea301fd7f5558cc64cb22107e6
commit 88a6f18b950e2e4dce57d31daa151105f4f3dcff upstream.
In preparation of handing in iocbs in a different fashion as well. Also
make it clear that the iocb being passed in isn't modified, by marking
it const throughout.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Cc: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
fs/aio.c | 68 +++++++++++++++++++++++++++++++-------------------------
1 file changed, 38 insertions(+), 30 deletions(-)
diff --git a/fs/aio.c b/fs/aio.c
index e2b63ab28ecc..6e1da220f04b 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -1416,7 +1416,7 @@ static void aio_complete_rw(struct kiocb *kiocb, long res, long res2)
aio_complete(iocb, res, res2);
}
-static int aio_prep_rw(struct kiocb *req, struct iocb *iocb)
+static int aio_prep_rw(struct kiocb *req, const struct iocb *iocb)
{
int ret;
@@ -1457,7 +1457,7 @@ static int aio_prep_rw(struct kiocb *req, struct iocb *iocb)
return ret;
}
-static int aio_setup_rw(int rw, struct iocb *iocb, struct iovec **iovec,
+static int aio_setup_rw(int rw, const struct iocb *iocb, struct iovec **iovec,
bool vectored, bool compat, struct iov_iter *iter)
{
void __user *buf = (void __user *)(uintptr_t)iocb->aio_buf;
@@ -1496,8 +1496,8 @@ static inline void aio_rw_done(struct kiocb *req, ssize_t ret)
}
}
-static ssize_t aio_read(struct kiocb *req, struct iocb *iocb, bool vectored,
- bool compat)
+static ssize_t aio_read(struct kiocb *req, const struct iocb *iocb,
+ bool vectored, bool compat)
{
struct iovec inline_vecs[UIO_FASTIOV], *iovec = inline_vecs;
struct iov_iter iter;
@@ -1529,8 +1529,8 @@ static ssize_t aio_read(struct kiocb *req, struct iocb *iocb, bool vectored,
return ret;
}
-static ssize_t aio_write(struct kiocb *req, struct iocb *iocb, bool vectored,
- bool compat)
+static ssize_t aio_write(struct kiocb *req, const struct iocb *iocb,
+ bool vectored, bool compat)
{
struct iovec inline_vecs[UIO_FASTIOV], *iovec = inline_vecs;
struct iov_iter iter;
@@ -1585,7 +1585,8 @@ static void aio_fsync_work(struct work_struct *work)
aio_complete(container_of(req, struct aio_kiocb, fsync), ret, 0);
}
-static int aio_fsync(struct fsync_iocb *req, struct iocb *iocb, bool datasync)
+static int aio_fsync(struct fsync_iocb *req, const struct iocb *iocb,
+ bool datasync)
{
if (unlikely(iocb->aio_buf || iocb->aio_offset || iocb->aio_nbytes ||
iocb->aio_rw_flags))
@@ -1719,7 +1720,7 @@ aio_poll_queue_proc(struct file *file, struct wait_queue_head *head,
add_wait_queue(head, &pt->iocb->poll.wait);
}
-static ssize_t aio_poll(struct aio_kiocb *aiocb, struct iocb *iocb)
+static ssize_t aio_poll(struct aio_kiocb *aiocb, const struct iocb *iocb)
{
struct kioctx *ctx = aiocb->ki_ctx;
struct poll_iocb *req = &aiocb->poll;
@@ -1791,27 +1792,23 @@ static ssize_t aio_poll(struct aio_kiocb *aiocb, struct iocb *iocb)
return 0;
}
-static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb,
- bool compat)
+static int __io_submit_one(struct kioctx *ctx, const struct iocb *iocb,
+ struct iocb __user *user_iocb, bool compat)
{
struct aio_kiocb *req;
- struct iocb iocb;
ssize_t ret;
- if (unlikely(copy_from_user(&iocb, user_iocb, sizeof(iocb))))
- return -EFAULT;
-
/* enforce forwards compatibility on users */
- if (unlikely(iocb.aio_reserved2)) {
+ if (unlikely(iocb->aio_reserved2)) {
pr_debug("EINVAL: reserve field set\n");
return -EINVAL;
}
/* prevent overflows */
if (unlikely(
- (iocb.aio_buf != (unsigned long)iocb.aio_buf) ||
- (iocb.aio_nbytes != (size_t)iocb.aio_nbytes) ||
- ((ssize_t)iocb.aio_nbytes < 0)
+ (iocb->aio_buf != (unsigned long)iocb->aio_buf) ||
+ (iocb->aio_nbytes != (size_t)iocb->aio_nbytes) ||
+ ((ssize_t)iocb->aio_nbytes < 0)
)) {
pr_debug("EINVAL: overflow check\n");
return -EINVAL;
@@ -1825,14 +1822,14 @@ static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb,
if (unlikely(!req))
goto out_put_reqs_available;
- if (iocb.aio_flags & IOCB_FLAG_RESFD) {
+ if (iocb->aio_flags & IOCB_FLAG_RESFD) {
/*
* If the IOCB_FLAG_RESFD flag of aio_flags is set, get an
* instance of the file* now. The file descriptor must be
* an eventfd() fd, and will be signaled for each completed
* event using the eventfd_signal() function.
*/
- req->ki_eventfd = eventfd_ctx_fdget((int) iocb.aio_resfd);
+ req->ki_eventfd = eventfd_ctx_fdget((int) iocb->aio_resfd);
if (IS_ERR(req->ki_eventfd)) {
ret = PTR_ERR(req->ki_eventfd);
req->ki_eventfd = NULL;
@@ -1847,32 +1844,32 @@ static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb,
}
req->ki_user_iocb = user_iocb;
- req->ki_user_data = iocb.aio_data;
+ req->ki_user_data = iocb->aio_data;
- switch (iocb.aio_lio_opcode) {
+ switch (iocb->aio_lio_opcode) {
case IOCB_CMD_PREAD:
- ret = aio_read(&req->rw, &iocb, false, compat);
+ ret = aio_read(&req->rw, iocb, false, compat);
break;
case IOCB_CMD_PWRITE:
- ret = aio_write(&req->rw, &iocb, false, compat);
+ ret = aio_write(&req->rw, iocb, false, compat);
break;
case IOCB_CMD_PREADV:
- ret = aio_read(&req->rw, &iocb, true, compat);
+ ret = aio_read(&req->rw, iocb, true, compat);
break;
case IOCB_CMD_PWRITEV:
- ret = aio_write(&req->rw, &iocb, true, compat);
+ ret = aio_write(&req->rw, iocb, true, compat);
break;
case IOCB_CMD_FSYNC:
- ret = aio_fsync(&req->fsync, &iocb, false);
+ ret = aio_fsync(&req->fsync, iocb, false);
break;
case IOCB_CMD_FDSYNC:
- ret = aio_fsync(&req->fsync, &iocb, true);
+ ret = aio_fsync(&req->fsync, iocb, true);
break;
case IOCB_CMD_POLL:
- ret = aio_poll(req, &iocb);
+ ret = aio_poll(req, iocb);
break;
default:
- pr_debug("invalid aio operation %d\n", iocb.aio_lio_opcode);
+ pr_debug("invalid aio operation %d\n", iocb->aio_lio_opcode);
ret = -EINVAL;
break;
}
@@ -1894,6 +1891,17 @@ static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb,
return ret;
}
+static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb,
+ bool compat)
+{
+ struct iocb iocb;
+
+ if (unlikely(copy_from_user(&iocb, user_iocb, sizeof(iocb))))
+ return -EFAULT;
+
+ return __io_submit_one(ctx, &iocb, user_iocb, compat);
+}
+
/* sys_io_submit:
* Queue the nr iocbs pointed to by iocbpp for processing. Returns
* the number of iocbs queued. May return -EINVAL if the aio_context


@@ -1,47 +0,0 @@
From: Jens Axboe <axboe@kernel.dk>
Date: Tue, 20 Nov 2018 20:06:23 -0700
Subject: [07/14] aio: abstract out io_event filler helper
Origin: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/commit?id=a812f7b68a3940e0369fd0fb24febec794a67623
commit 875736bb3f3ded168469f6a14df7a938416a99d5 upstream.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Cc: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
fs/aio.c | 14 ++++++++++----
1 file changed, 10 insertions(+), 4 deletions(-)
diff --git a/fs/aio.c b/fs/aio.c
index 6e1da220f04b..f6ce01ca6903 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -1059,6 +1059,15 @@ static inline void iocb_put(struct aio_kiocb *iocb)
}
}
+static void aio_fill_event(struct io_event *ev, struct aio_kiocb *iocb,
+ long res, long res2)
+{
+ ev->obj = (u64)(unsigned long)iocb->ki_user_iocb;
+ ev->data = iocb->ki_user_data;
+ ev->res = res;
+ ev->res2 = res2;
+}
+
/* aio_complete
* Called when the io request on the given iocb is complete.
*/
@@ -1086,10 +1095,7 @@ static void aio_complete(struct aio_kiocb *iocb, long res, long res2)
ev_page = kmap_atomic(ctx->ring_pages[pos / AIO_EVENTS_PER_PAGE]);
event = ev_page + pos % AIO_EVENTS_PER_PAGE;
- event->obj = (u64)(unsigned long)iocb->ki_user_iocb;
- event->data = iocb->ki_user_data;
- event->res = res;
- event->res2 = res2;
+ aio_fill_event(event, iocb, res, res2);
kunmap_atomic(ev_page);
flush_dcache_page(ctx->ring_pages[pos / AIO_EVENTS_PER_PAGE]);


@@ -1,32 +0,0 @@
From: Mike Marshall <hubcap@omnibond.com>
Date: Tue, 5 Feb 2019 14:13:35 -0500
Subject: [08/14] aio: initialize kiocb private in case any filesystems expect
it.
Origin: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/commit?id=2afa01cd9186974051b38b7d1f31bb2407e41e3a
commit ec51f8ee1e63498e9f521ec0e5a6d04622bb2c67 upstream.
A recent optimization had left private uninitialized.
Fixes: 2bc4ca9bb600 ("aio: don't zero entire aio_kiocb aio_get_req()")
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Cc: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
fs/aio.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/fs/aio.c b/fs/aio.c
index f6ce01ca6903..d74fc9e112ac 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -1430,6 +1430,7 @@ static int aio_prep_rw(struct kiocb *req, const struct iocb *iocb)
if (unlikely(!req->ki_filp))
return -EBADF;
req->ki_complete = aio_complete_rw;
+ req->private = NULL;
req->ki_pos = iocb->aio_offset;
req->ki_flags = iocb_flags(req->ki_filp);
if (iocb->aio_flags & IOCB_FLAG_RESFD)


@@ -1,312 +0,0 @@
From: Linus Torvalds <torvalds@linux-foundation.org>
Date: Sun, 3 Mar 2019 14:23:33 -0800
Subject: [09/14] aio: simplify - and fix - fget/fput for io_submit()
Origin: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/commit?id=d6b2615f7d31d8e58b685d42dbafcc7dc1204bbd
Bug-Debian-Security: https://security-tracker.debian.org/tracker/CVE-2019-10125
commit 84c4e1f89fefe70554da0ab33be72c9be7994379 upstream.
Al Viro root-caused a race where the IOCB_CMD_POLL handling of
fget/fput() could cause us to access the file pointer after it had
already been freed:
"In more details - normally IOCB_CMD_POLL handling looks so:
1) io_submit(2) allocates aio_kiocb instance and passes it to
aio_poll()
2) aio_poll() resolves the descriptor to struct file by req->file =
fget(iocb->aio_fildes)
3) aio_poll() sets ->woken to false and raises ->ki_refcnt of that
aio_kiocb to 2 (bumps by 1, that is).
4) aio_poll() calls vfs_poll(). After sanity checks (basically,
"poll_wait() had been called and only once") it locks the queue.
That's what the extra reference to iocb had been for - we know we
can safely access it.
5) With queue locked, we check if ->woken has already been set to
true (by aio_poll_wake()) and, if it had been, we unlock the
queue, drop a reference to aio_kiocb and bugger off - at that
point it's a responsibility to aio_poll_wake() and the stuff
called/scheduled by it. That code will drop the reference to file
in req->file, along with the other reference to our aio_kiocb.
6) otherwise, we see whether we need to wait. If we do, we unlock the
queue, drop one reference to aio_kiocb and go away - eventual
wakeup (or cancel) will deal with the reference to file and with
the other reference to aio_kiocb
7) otherwise we remove ourselves from waitqueue (still under the
queue lock), so that wakeup won't get us. No async activity will
be happening, so we can safely drop req->file and iocb ourselves.
If wakeup happens while we are in vfs_poll(), we are fine - aio_kiocb
won't get freed under us, so we can do all the checks and locking
safely. And we don't touch ->file if we detect that case.
However, vfs_poll() most certainly *does* touch the file it had been
given. So wakeup coming while we are still in ->poll() might end up
doing fput() on that file. That case is not too rare, and usually we
are saved by the still present reference from descriptor table - that
fput() is not the final one.
But if another thread closes that descriptor right after our fget()
and wakeup does happen before ->poll() returns, we are in trouble -
final fput() done while we are in the middle of a method:
Al also wrote a patch to take an extra reference to the file descriptor
to fix this, but I instead suggested we just streamline the whole file
pointer handling by submit_io() so that the generic aio submission code
simply keeps the file pointer around until the aio has completed.
Fixes: bfe4037e722e ("aio: implement IOCB_CMD_POLL")
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Reported-by: syzbot+503d4cc169fcec1cb18c@syzkaller.appspotmail.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
fs/aio.c | 72 +++++++++++++++++++---------------------------
include/linux/fs.h | 8 +++++-
2 files changed, 36 insertions(+), 44 deletions(-)
diff --git a/fs/aio.c b/fs/aio.c
index d74fc9e112ac..46229e663b57 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -161,9 +161,13 @@ struct kioctx {
unsigned id;
};
+/*
+ * First field must be the file pointer in all the
+ * iocb unions! See also 'struct kiocb' in <linux/fs.h>
+ */
struct fsync_iocb {
- struct work_struct work;
struct file *file;
+ struct work_struct work;
bool datasync;
};
@@ -177,8 +181,15 @@ struct poll_iocb {
struct work_struct work;
};
+/*
+ * NOTE! Each of the iocb union members has the file pointer
+ * as the first entry in their struct definition. So you can
+ * access the file pointer through any of the sub-structs,
+ * or directly as just 'ki_filp' in this struct.
+ */
struct aio_kiocb {
union {
+ struct file *ki_filp;
struct kiocb rw;
struct fsync_iocb fsync;
struct poll_iocb poll;
@@ -1054,6 +1065,8 @@ static inline void iocb_put(struct aio_kiocb *iocb)
{
if (refcount_read(&iocb->ki_refcnt) == 0 ||
refcount_dec_and_test(&iocb->ki_refcnt)) {
+ if (iocb->ki_filp)
+ fput(iocb->ki_filp);
percpu_ref_put(&iocb->ki_ctx->reqs);
kmem_cache_free(kiocb_cachep, iocb);
}
@@ -1418,7 +1431,6 @@ static void aio_complete_rw(struct kiocb *kiocb, long res, long res2)
file_end_write(kiocb->ki_filp);
}
- fput(kiocb->ki_filp);
aio_complete(iocb, res, res2);
}
@@ -1426,9 +1438,6 @@ static int aio_prep_rw(struct kiocb *req, const struct iocb *iocb)
{
int ret;
- req->ki_filp = fget(iocb->aio_fildes);
- if (unlikely(!req->ki_filp))
- return -EBADF;
req->ki_complete = aio_complete_rw;
req->private = NULL;
req->ki_pos = iocb->aio_offset;
@@ -1445,7 +1454,7 @@ static int aio_prep_rw(struct kiocb *req, const struct iocb *iocb)
ret = ioprio_check_cap(iocb->aio_reqprio);
if (ret) {
pr_debug("aio ioprio check cap error: %d\n", ret);
- goto out_fput;
+ return ret;
}
req->ki_ioprio = iocb->aio_reqprio;
@@ -1454,14 +1463,10 @@ static int aio_prep_rw(struct kiocb *req, const struct iocb *iocb)
ret = kiocb_set_rw_flags(req, iocb->aio_rw_flags);
if (unlikely(ret))
- goto out_fput;
+ return ret;
req->ki_flags &= ~IOCB_HIPRI; /* no one is going to poll for this I/O */
return 0;
-
-out_fput:
- fput(req->ki_filp);
- return ret;
}
static int aio_setup_rw(int rw, const struct iocb *iocb, struct iovec **iovec,
@@ -1515,24 +1520,19 @@ static ssize_t aio_read(struct kiocb *req, const struct iocb *iocb,
if (ret)
return ret;
file = req->ki_filp;
-
- ret = -EBADF;
if (unlikely(!(file->f_mode & FMODE_READ)))
- goto out_fput;
+ return -EBADF;
ret = -EINVAL;
if (unlikely(!file->f_op->read_iter))
- goto out_fput;
+ return -EINVAL;
ret = aio_setup_rw(READ, iocb, &iovec, vectored, compat, &iter);
if (ret)
- goto out_fput;
+ return ret;
ret = rw_verify_area(READ, file, &req->ki_pos, iov_iter_count(&iter));
if (!ret)
aio_rw_done(req, call_read_iter(file, req, &iter));
kfree(iovec);
-out_fput:
- if (unlikely(ret))
- fput(file);
return ret;
}
@@ -1549,16 +1549,14 @@ static ssize_t aio_write(struct kiocb *req, const struct iocb *iocb,
return ret;
file = req->ki_filp;
- ret = -EBADF;
if (unlikely(!(file->f_mode & FMODE_WRITE)))
- goto out_fput;
- ret = -EINVAL;
+ return -EBADF;
if (unlikely(!file->f_op->write_iter))
- goto out_fput;
+ return -EINVAL;
ret = aio_setup_rw(WRITE, iocb, &iovec, vectored, compat, &iter);
if (ret)
- goto out_fput;
+ return ret;
ret = rw_verify_area(WRITE, file, &req->ki_pos, iov_iter_count(&iter));
if (!ret) {
/*
@@ -1576,9 +1574,6 @@ static ssize_t aio_write(struct kiocb *req, const struct iocb *iocb,
aio_rw_done(req, call_write_iter(file, req, &iter));
}
kfree(iovec);
-out_fput:
- if (unlikely(ret))
- fput(file);
return ret;
}
@@ -1588,7 +1583,6 @@ static void aio_fsync_work(struct work_struct *work)
int ret;
ret = vfs_fsync(req->file, req->datasync);
- fput(req->file);
aio_complete(container_of(req, struct aio_kiocb, fsync), ret, 0);
}
@@ -1599,13 +1593,8 @@ static int aio_fsync(struct fsync_iocb *req, const struct iocb *iocb,
iocb->aio_rw_flags))
return -EINVAL;
- req->file = fget(iocb->aio_fildes);
- if (unlikely(!req->file))
- return -EBADF;
- if (unlikely(!req->file->f_op->fsync)) {
- fput(req->file);
+ if (unlikely(!req->file->f_op->fsync))
return -EINVAL;
- }
req->datasync = datasync;
INIT_WORK(&req->work, aio_fsync_work);
@@ -1615,10 +1604,7 @@ static int aio_fsync(struct fsync_iocb *req, const struct iocb *iocb,
static inline void aio_poll_complete(struct aio_kiocb *iocb, __poll_t mask)
{
- struct file *file = iocb->poll.file;
-
aio_complete(iocb, mangle_poll(mask), 0);
- fput(file);
}
static void aio_poll_complete_work(struct work_struct *work)
@@ -1743,9 +1729,6 @@ static ssize_t aio_poll(struct aio_kiocb *aiocb, const struct iocb *iocb)
INIT_WORK(&req->work, aio_poll_complete_work);
req->events = demangle_poll(iocb->aio_buf) | EPOLLERR | EPOLLHUP;
- req->file = fget(iocb->aio_fildes);
- if (unlikely(!req->file))
- return -EBADF;
req->head = NULL;
req->woken = false;
@@ -1788,10 +1771,8 @@ static ssize_t aio_poll(struct aio_kiocb *aiocb, const struct iocb *iocb)
spin_unlock_irq(&ctx->ctx_lock);
out:
- if (unlikely(apt.error)) {
- fput(req->file);
+ if (unlikely(apt.error))
return apt.error;
- }
if (mask)
aio_poll_complete(aiocb, mask);
@@ -1829,6 +1810,11 @@ static int __io_submit_one(struct kioctx *ctx, const struct iocb *iocb,
if (unlikely(!req))
goto out_put_reqs_available;
+ req->ki_filp = fget(iocb->aio_fildes);
+ ret = -EBADF;
+ if (unlikely(!req->ki_filp))
+ goto out_put_req;
+
if (iocb->aio_flags & IOCB_FLAG_RESFD) {
/*
* If the IOCB_FLAG_RESFD flag of aio_flags is set, get an
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 7b6084854bfe..111c94c4baa1 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -304,13 +304,19 @@ enum rw_hint {
struct kiocb {
struct file *ki_filp;
+
+ /* The 'ki_filp' pointer is shared in a union for aio */
+ randomized_struct_fields_start
+
loff_t ki_pos;
void (*ki_complete)(struct kiocb *iocb, long ret, long ret2);
void *private;
int ki_flags;
u16 ki_hint;
u16 ki_ioprio; /* See linux/ioprio.h */
-} __randomize_layout;
+
+ randomized_struct_fields_end
+};
static inline bool is_sync_kiocb(struct kiocb *kiocb)
{
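
The invariant the new comments spell out (every member of the union starts
with the file pointer) is what lets ki_filp alias the per-type file fields. A
compressed illustration of the layout trick, with hypothetical stand-in types:

    #include <assert.h>
    #include <stddef.h>

    struct file;    /* opaque here */

    struct fsync_like { struct file *file; int datasync; };
    struct poll_like  { struct file *file; int events; };

    struct kiocb_like {
        union {
            struct file *ki_filp;   /* aliases the fields below */
            struct fsync_like fsync;
            struct poll_like poll;
        };
    };

    int main(void)
    {
        /* All three views of the file pointer sit at offset 0. */
        assert(offsetof(struct kiocb_like, ki_filp) == 0);
        assert(offsetof(struct fsync_like, file) == 0);
        assert(offsetof(struct poll_like, file) == 0);
        return 0;
    }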


@@ -1,112 +0,0 @@
From: Linus Torvalds <torvalds@linux-foundation.org>
Date: Wed, 6 Mar 2019 20:22:54 -0500
Subject: [10/14] pin iocb through aio.
Origin: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/commit?id=c7f2525abfecf8a57a1417837b6a809df79b299e
commit b53119f13a04879c3bf502828d99d13726639ead upstream.
aio_poll() is not the only case that needs file pinned; worse, while
aio_read()/aio_write() can live without pinning iocb itself, the
proof is rather brittle and can easily break on later changes.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
fs/aio.c | 37 +++++++++++++++++++++----------------
1 file changed, 21 insertions(+), 16 deletions(-)
diff --git a/fs/aio.c b/fs/aio.c
index 46229e663b57..10e5a8f52dce 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -1016,6 +1016,9 @@ static bool get_reqs_available(struct kioctx *ctx)
/* aio_get_req
* Allocate a slot for an aio request.
* Returns NULL if no requests are free.
+ *
+ * The refcount is initialized to 2 - one for the async op completion,
+ * one for the synchronous code that does this.
*/
static inline struct aio_kiocb *aio_get_req(struct kioctx *ctx)
{
@@ -1028,7 +1031,7 @@ static inline struct aio_kiocb *aio_get_req(struct kioctx *ctx)
percpu_ref_get(&ctx->reqs);
req->ki_ctx = ctx;
INIT_LIST_HEAD(&req->ki_list);
- refcount_set(&req->ki_refcnt, 0);
+ refcount_set(&req->ki_refcnt, 2);
req->ki_eventfd = NULL;
return req;
}
@@ -1061,15 +1064,18 @@ static struct kioctx *lookup_ioctx(unsigned long ctx_id)
return ret;
}
+static inline void iocb_destroy(struct aio_kiocb *iocb)
+{
+ if (iocb->ki_filp)
+ fput(iocb->ki_filp);
+ percpu_ref_put(&iocb->ki_ctx->reqs);
+ kmem_cache_free(kiocb_cachep, iocb);
+}
+
static inline void iocb_put(struct aio_kiocb *iocb)
{
- if (refcount_read(&iocb->ki_refcnt) == 0 ||
- refcount_dec_and_test(&iocb->ki_refcnt)) {
- if (iocb->ki_filp)
- fput(iocb->ki_filp);
- percpu_ref_put(&iocb->ki_ctx->reqs);
- kmem_cache_free(kiocb_cachep, iocb);
- }
+ if (refcount_dec_and_test(&iocb->ki_refcnt))
+ iocb_destroy(iocb);
}
static void aio_fill_event(struct io_event *ev, struct aio_kiocb *iocb,
@@ -1743,9 +1749,6 @@ static ssize_t aio_poll(struct aio_kiocb *aiocb, const struct iocb *iocb)
INIT_LIST_HEAD(&req->wait.entry);
init_waitqueue_func_entry(&req->wait, aio_poll_wake);
- /* one for removal from waitqueue, one for this function */
- refcount_set(&aiocb->ki_refcnt, 2);
-
mask = vfs_poll(req->file, &apt.pt) & req->events;
if (unlikely(!req->head)) {
/* we did not manage to set up a waitqueue, done */
@@ -1776,7 +1779,6 @@ static ssize_t aio_poll(struct aio_kiocb *aiocb, const struct iocb *iocb)
if (mask)
aio_poll_complete(aiocb, mask);
- iocb_put(aiocb);
return 0;
}
@@ -1867,18 +1869,21 @@ static int __io_submit_one(struct kioctx *ctx, const struct iocb *iocb,
break;
}
+ /* Done with the synchronous reference */
+ iocb_put(req);
+
/*
* If ret is 0, we'd either done aio_complete() ourselves or have
* arranged for that to be done asynchronously. Anything non-zero
* means that we need to destroy req ourselves.
*/
- if (ret)
- goto out_put_req;
- return 0;
+ if (!ret)
+ return 0;
+
out_put_req:
if (req->ki_eventfd)
eventfd_ctx_put(req->ki_eventfd);
- iocb_put(req);
+ iocb_destroy(req);
out_put_reqs_available:
put_reqs_available(ctx, 1);
return ret;
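
Starting the refcount at 2 gives the request two owners from the outset: the
synchronous submission path and the asynchronous completion path, with
whichever put runs last destroying the request. A compressed user-space
sketch of that lifecycle (atomicity omitted; the kernel uses refcount_t):

    #include <stdio.h>
    #include <stdlib.h>

    struct request { int refs; };

    static void req_put(struct request *r)
    {
        if (--r->refs == 0) {   /* last owner frees */
            free(r);
            puts("request destroyed");
        }
    }

    int main(void)
    {
        struct request *r = malloc(sizeof(*r));

        if (!r)
            return 1;
        r->refs = 2;    /* one for the submitter, one for completion */
        req_put(r);     /* submission path done */
        req_put(r);     /* completion path done: destroys the request */
        return 0;
    }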


@@ -1,61 +0,0 @@
From: Al Viro <viro@zeniv.linux.org.uk>
Date: Mon, 11 Mar 2019 19:00:36 -0400
Subject: [11/14] aio: fold lookup_kiocb() into its sole caller
Origin: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/commit?id=592ea630b081a6c97ec56499b0e12f68fd2da2d8
commit 833f4154ed560232120bc475935ee1d6a20e159f upstream.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
fs/aio.c | 29 +++++++----------------------
1 file changed, 7 insertions(+), 22 deletions(-)
diff --git a/fs/aio.c b/fs/aio.c
index 10e5a8f52dce..cda193f6de76 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -1992,24 +1992,6 @@ COMPAT_SYSCALL_DEFINE3(io_submit, compat_aio_context_t, ctx_id,
}
#endif
-/* lookup_kiocb
- * Finds a given iocb for cancellation.
- */
-static struct aio_kiocb *
-lookup_kiocb(struct kioctx *ctx, struct iocb __user *iocb)
-{
- struct aio_kiocb *kiocb;
-
- assert_spin_locked(&ctx->ctx_lock);
-
- /* TODO: use a hash or array, this sucks. */
- list_for_each_entry(kiocb, &ctx->active_reqs, ki_list) {
- if (kiocb->ki_user_iocb == iocb)
- return kiocb;
- }
- return NULL;
-}
-
/* sys_io_cancel:
* Attempts to cancel an iocb previously passed to io_submit. If
* the operation is successfully cancelled, the resulting event is
@@ -2038,10 +2020,13 @@ SYSCALL_DEFINE3(io_cancel, aio_context_t, ctx_id, struct iocb __user *, iocb,
return -EINVAL;
spin_lock_irq(&ctx->ctx_lock);
- kiocb = lookup_kiocb(ctx, iocb);
- if (kiocb) {
- ret = kiocb->ki_cancel(&kiocb->rw);
- list_del_init(&kiocb->ki_list);
+ /* TODO: use a hash or array, this sucks. */
+ list_for_each_entry(kiocb, &ctx->active_reqs, ki_list) {
+ if (kiocb->ki_user_iocb == iocb) {
+ ret = kiocb->ki_cancel(&kiocb->rw);
+ list_del_init(&kiocb->ki_list);
+ break;
+ }
}
spin_unlock_irq(&ctx->ctx_lock);


@@ -1,105 +0,0 @@
From: Al Viro <viro@zeniv.linux.org.uk>
Date: Thu, 7 Mar 2019 19:43:45 -0500
Subject: [12/14] aio: keep io_event in aio_kiocb
Origin: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/commit?id=c20202c51d2b6703a4e539235f892f34daabd791
commit a9339b7855094ba11a97e8822ae038135e879e79 upstream.
We want to separate forming the resulting io_event from putting it
into the ring buffer.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
fs/aio.c | 31 +++++++++++++------------------
1 file changed, 13 insertions(+), 18 deletions(-)
diff --git a/fs/aio.c b/fs/aio.c
index cda193f6de76..ec30f1bdac0c 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -198,8 +198,7 @@ struct aio_kiocb {
struct kioctx *ki_ctx;
kiocb_cancel_fn *ki_cancel;
- struct iocb __user *ki_user_iocb; /* user's aiocb */
- __u64 ki_user_data; /* user's data for completion */
+ struct io_event ki_res;
struct list_head ki_list; /* the aio core uses this
* for cancellation */
@@ -1078,15 +1077,6 @@ static inline void iocb_put(struct aio_kiocb *iocb)
iocb_destroy(iocb);
}
-static void aio_fill_event(struct io_event *ev, struct aio_kiocb *iocb,
- long res, long res2)
-{
- ev->obj = (u64)(unsigned long)iocb->ki_user_iocb;
- ev->data = iocb->ki_user_data;
- ev->res = res;
- ev->res2 = res2;
-}
-
/* aio_complete
* Called when the io request on the given iocb is complete.
*/
@@ -1098,6 +1088,8 @@ static void aio_complete(struct aio_kiocb *iocb, long res, long res2)
unsigned tail, pos, head;
unsigned long flags;
+ iocb->ki_res.res = res;
+ iocb->ki_res.res2 = res2;
/*
* Add a completion event to the ring buffer. Must be done holding
* ctx->completion_lock to prevent other code from messing with the tail
@@ -1114,14 +1106,14 @@ static void aio_complete(struct aio_kiocb *iocb, long res, long res2)
ev_page = kmap_atomic(ctx->ring_pages[pos / AIO_EVENTS_PER_PAGE]);
event = ev_page + pos % AIO_EVENTS_PER_PAGE;
- aio_fill_event(event, iocb, res, res2);
+ *event = iocb->ki_res;
kunmap_atomic(ev_page);
flush_dcache_page(ctx->ring_pages[pos / AIO_EVENTS_PER_PAGE]);
- pr_debug("%p[%u]: %p: %p %Lx %lx %lx\n",
- ctx, tail, iocb, iocb->ki_user_iocb, iocb->ki_user_data,
- res, res2);
+ pr_debug("%p[%u]: %p: %p %Lx %Lx %Lx\n", ctx, tail, iocb,
+ (void __user *)(unsigned long)iocb->ki_res.obj,
+ iocb->ki_res.data, iocb->ki_res.res, iocb->ki_res.res2);
/* after flagging the request as done, we
* must never even look at it again
@@ -1838,8 +1830,10 @@ static int __io_submit_one(struct kioctx *ctx, const struct iocb *iocb,
goto out_put_req;
}
- req->ki_user_iocb = user_iocb;
- req->ki_user_data = iocb->aio_data;
+ req->ki_res.obj = (u64)(unsigned long)user_iocb;
+ req->ki_res.data = iocb->aio_data;
+ req->ki_res.res = 0;
+ req->ki_res.res2 = 0;
switch (iocb->aio_lio_opcode) {
case IOCB_CMD_PREAD:
@@ -2009,6 +2003,7 @@ SYSCALL_DEFINE3(io_cancel, aio_context_t, ctx_id, struct iocb __user *, iocb,
struct aio_kiocb *kiocb;
int ret = -EINVAL;
u32 key;
+ u64 obj = (u64)(unsigned long)iocb;
if (unlikely(get_user(key, &iocb->aio_key)))
return -EFAULT;
@@ -2022,7 +2017,7 @@ SYSCALL_DEFINE3(io_cancel, aio_context_t, ctx_id, struct iocb __user *, iocb,
spin_lock_irq(&ctx->ctx_lock);
/* TODO: use a hash or array, this sucks. */
list_for_each_entry(kiocb, &ctx->active_reqs, ki_list) {
- if (kiocb->ki_user_iocb == iocb) {
+ if (kiocb->ki_res.obj == obj) {
ret = kiocb->ki_cancel(&kiocb->rw);
list_del_init(&kiocb->ki_list);
break;
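
Embedding the io_event in the request means submission fills the identifying
half (obj, data) up front and completion only fills the result half, after
which publishing the event into the ring is a single struct assignment. A
sketch of that split with hypothetical stand-in types:

    struct io_event_like { unsigned long long obj, data, res, res2; };

    struct req_like { struct io_event_like ki_res; };

    /* At submission: identity is known, results are zeroed. */
    static void fill_at_submit(struct req_like *r,
                               unsigned long long user_iocb,
                               unsigned long long user_data)
    {
        r->ki_res.obj  = user_iocb;
        r->ki_res.data = user_data;
        r->ki_res.res  = 0;
        r->ki_res.res2 = 0;
    }

    /* At completion: one assignment copies the whole event into the
     * ring slot, much as aio_complete() now does with *event. */
    static void publish(struct io_event_like *ring_slot,
                        struct req_like *r, long res, long res2)
    {
        r->ki_res.res  = (unsigned long long)res;
        r->ki_res.res2 = (unsigned long long)res2;
        *ring_slot = r->ki_res;
    }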


@@ -1,101 +0,0 @@
From: Al Viro <viro@zeniv.linux.org.uk>
Date: Thu, 7 Mar 2019 19:49:55 -0500
Subject: [13/14] aio: store event at final iocb_put()
Origin: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/commit?id=aab66dfb757aa5b211ec6b0c322b42f4ef5ab34f
commit 2bb874c0d873d13bd9b9b9c6d7b7c4edab18c8b4 upstream.
Instead of having aio_complete() set ->ki_res.{res,res2}, do that
explicitly in its callers, drop the reference (as aio_complete()
used to do) and delay the rest until the final iocb_put().
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
fs/aio.c | 33 +++++++++++++++++----------------
1 file changed, 17 insertions(+), 16 deletions(-)
diff --git a/fs/aio.c b/fs/aio.c
index ec30f1bdac0c..556ee620038f 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -1071,16 +1071,10 @@ static inline void iocb_destroy(struct aio_kiocb *iocb)
kmem_cache_free(kiocb_cachep, iocb);
}
-static inline void iocb_put(struct aio_kiocb *iocb)
-{
- if (refcount_dec_and_test(&iocb->ki_refcnt))
- iocb_destroy(iocb);
-}
-
/* aio_complete
* Called when the io request on the given iocb is complete.
*/
-static void aio_complete(struct aio_kiocb *iocb, long res, long res2)
+static void aio_complete(struct aio_kiocb *iocb)
{
struct kioctx *ctx = iocb->ki_ctx;
struct aio_ring *ring;
@@ -1088,8 +1082,6 @@ static void aio_complete(struct aio_kiocb *iocb, long res, long res2)
unsigned tail, pos, head;
unsigned long flags;
- iocb->ki_res.res = res;
- iocb->ki_res.res2 = res2;
/*
* Add a completion event to the ring buffer. Must be done holding
* ctx->completion_lock to prevent other code from messing with the tail
@@ -1155,7 +1147,14 @@ static void aio_complete(struct aio_kiocb *iocb, long res, long res2)
if (waitqueue_active(&ctx->wait))
wake_up(&ctx->wait);
- iocb_put(iocb);
+}
+
+static inline void iocb_put(struct aio_kiocb *iocb)
+{
+ if (refcount_dec_and_test(&iocb->ki_refcnt)) {
+ aio_complete(iocb);
+ iocb_destroy(iocb);
+ }
}
/* aio_read_events_ring
@@ -1429,7 +1428,9 @@ static void aio_complete_rw(struct kiocb *kiocb, long res, long res2)
file_end_write(kiocb->ki_filp);
}
- aio_complete(iocb, res, res2);
+ iocb->ki_res.res = res;
+ iocb->ki_res.res2 = res2;
+ iocb_put(iocb);
}
static int aio_prep_rw(struct kiocb *req, const struct iocb *iocb)
@@ -1577,11 +1578,10 @@ static ssize_t aio_write(struct kiocb *req, const struct iocb *iocb,
static void aio_fsync_work(struct work_struct *work)
{
- struct fsync_iocb *req = container_of(work, struct fsync_iocb, work);
- int ret;
+ struct aio_kiocb *iocb = container_of(work, struct aio_kiocb, fsync.work);
- ret = vfs_fsync(req->file, req->datasync);
- aio_complete(container_of(req, struct aio_kiocb, fsync), ret, 0);
+ iocb->ki_res.res = vfs_fsync(iocb->fsync.file, iocb->fsync.datasync);
+ iocb_put(iocb);
}
static int aio_fsync(struct fsync_iocb *req, const struct iocb *iocb,
@@ -1602,7 +1602,8 @@ static int aio_fsync(struct fsync_iocb *req, const struct iocb *iocb,
static inline void aio_poll_complete(struct aio_kiocb *iocb, __poll_t mask)
{
- aio_complete(iocb, mangle_poll(mask), 0);
+ iocb->ki_res.res = mangle_poll(mask);
+ iocb_put(iocb);
}
static void aio_poll_complete_work(struct work_struct *work)


@@ -1,225 +0,0 @@
From: Al Viro <viro@zeniv.linux.org.uk>
Date: Thu, 7 Mar 2019 21:45:41 -0500
Subject: [14/14] Fix aio_poll() races
Origin: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/commit?id=e9e47779aaa7212ccb75f8d8d4d16ab188efb313
commit af5c72b1fc7a00aa484e90b0c4e0eeb582545634 upstream.
aio_poll() has to cope with several unpleasant problems:
* requests that might stay around indefinitely need to
be made visible for io_cancel(2); that must not be done to
a request already completed, though.
* in cases when ->poll() has placed us on a waitqueue,
wakeup might have happened (and request completed) before ->poll()
returns.
* worse, in some early wakeup cases request might end
up re-added into the queue later - we can't treat "woken up and
currently not in the queue" as "it's not going to stick around
indefinitely"
* ... moreover, ->poll() might have decided not to
put it on any queues to start with, and that needs to be distinguished
from the previous case
* ->poll() might have tried to put us on more than one queue.
Only the first will succeed for aio poll, so we might end up missing
wakeups. OTOH, we might very well notice that only after the
wakeup hits and request gets completed (all before ->poll() gets
around to the second poll_wait()). In that case it's too late to
decide that we have an error.
req->woken was an attempt to deal with that. Unfortunately, it was
broken. What we need to keep track of is not that wakeup has happened -
the thing might come back after that. It's that async reference is
already gone and won't come back, so we can't (and needn't) put the
request on the list of cancellables.
The easiest case is "request hadn't been put on any waitqueues"; we
can tell by seeing NULL apt.head, and in that case there won't be
anything async. We should either complete the request ourselves
(if vfs_poll() reports anything of interest) or return an error.
In all other cases we get exclusion with wakeups by grabbing the
queue lock.
If request is currently on queue and we have something interesting
from vfs_poll(), we can steal it and complete the request ourselves.
If it's on queue and vfs_poll() has not reported anything interesting,
we either put it on the cancellable list, or, if we know that it
hadn't been put on all queues ->poll() wanted it on, we steal it and
return an error.
If it's _not_ on queue, it's either been already dealt with (in which
case we do nothing), or there's aio_poll_complete_work() about to be
executed. In that case we either put it on the cancellable list,
or, if we know it hadn't been put on all queues ->poll() wanted it on,
simulate what cancel would've done.
It's a lot more convoluted than I'd like it to be. Single-consumer APIs
suck, and unfortunately aio is not an exception...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
fs/aio.c | 90 +++++++++++++++++++++++++-------------------------------
1 file changed, 40 insertions(+), 50 deletions(-)
diff --git a/fs/aio.c b/fs/aio.c
index 556ee620038f..911e23087dfb 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -175,7 +175,7 @@ struct poll_iocb {
struct file *file;
struct wait_queue_head *head;
__poll_t events;
- bool woken;
+ bool done;
bool cancelled;
struct wait_queue_entry wait;
struct work_struct work;
@@ -1600,12 +1600,6 @@ static int aio_fsync(struct fsync_iocb *req, const struct iocb *iocb,
return 0;
}
-static inline void aio_poll_complete(struct aio_kiocb *iocb, __poll_t mask)
-{
- iocb->ki_res.res = mangle_poll(mask);
- iocb_put(iocb);
-}
-
static void aio_poll_complete_work(struct work_struct *work)
{
struct poll_iocb *req = container_of(work, struct poll_iocb, work);
@@ -1631,9 +1625,11 @@ static void aio_poll_complete_work(struct work_struct *work)
return;
}
list_del_init(&iocb->ki_list);
+ iocb->ki_res.res = mangle_poll(mask);
+ req->done = true;
spin_unlock_irq(&ctx->ctx_lock);
- aio_poll_complete(iocb, mask);
+ iocb_put(iocb);
}
/* assumes we are called with irqs disabled */
@@ -1661,31 +1657,27 @@ static int aio_poll_wake(struct wait_queue_entry *wait, unsigned mode, int sync,
__poll_t mask = key_to_poll(key);
unsigned long flags;
- req->woken = true;
-
/* for instances that support it check for an event match first: */
- if (mask) {
- if (!(mask & req->events))
- return 0;
+ if (mask && !(mask & req->events))
+ return 0;
+ list_del_init(&req->wait.entry);
+
+ if (mask && spin_trylock_irqsave(&iocb->ki_ctx->ctx_lock, flags)) {
/*
* Try to complete the iocb inline if we can. Use
* irqsave/irqrestore because not all filesystems (e.g. fuse)
* call this function with IRQs disabled and because IRQs
* have to be disabled before ctx_lock is obtained.
*/
- if (spin_trylock_irqsave(&iocb->ki_ctx->ctx_lock, flags)) {
- list_del(&iocb->ki_list);
- spin_unlock_irqrestore(&iocb->ki_ctx->ctx_lock, flags);
-
- list_del_init(&req->wait.entry);
- aio_poll_complete(iocb, mask);
- return 1;
- }
+ list_del(&iocb->ki_list);
+ iocb->ki_res.res = mangle_poll(mask);
+ req->done = true;
+ spin_unlock_irqrestore(&iocb->ki_ctx->ctx_lock, flags);
+ iocb_put(iocb);
+ } else {
+ schedule_work(&req->work);
}
-
- list_del_init(&req->wait.entry);
- schedule_work(&req->work);
return 1;
}
@@ -1717,6 +1709,7 @@ static ssize_t aio_poll(struct aio_kiocb *aiocb, const struct iocb *iocb)
struct kioctx *ctx = aiocb->ki_ctx;
struct poll_iocb *req = &aiocb->poll;
struct aio_poll_table apt;
+ bool cancel = false;
__poll_t mask;
/* reject any unknown events outside the normal event mask. */
@@ -1730,7 +1723,7 @@ static ssize_t aio_poll(struct aio_kiocb *aiocb, const struct iocb *iocb)
req->events = demangle_poll(iocb->aio_buf) | EPOLLERR | EPOLLHUP;
req->head = NULL;
- req->woken = false;
+ req->done = false;
req->cancelled = false;
apt.pt._qproc = aio_poll_queue_proc;
@@ -1743,36 +1736,33 @@ static ssize_t aio_poll(struct aio_kiocb *aiocb, const struct iocb *iocb)
init_waitqueue_func_entry(&req->wait, aio_poll_wake);
mask = vfs_poll(req->file, &apt.pt) & req->events;
- if (unlikely(!req->head)) {
- /* we did not manage to set up a waitqueue, done */
- goto out;
- }
-
spin_lock_irq(&ctx->ctx_lock);
- spin_lock(&req->head->lock);
- if (req->woken) {
- /* wake_up context handles the rest */
- mask = 0;
+ if (likely(req->head)) {
+ spin_lock(&req->head->lock);
+ if (unlikely(list_empty(&req->wait.entry))) {
+ if (apt.error)
+ cancel = true;
+ apt.error = 0;
+ mask = 0;
+ }
+ if (mask || apt.error) {
+ list_del_init(&req->wait.entry);
+ } else if (cancel) {
+ WRITE_ONCE(req->cancelled, true);
+ } else if (!req->done) { /* actually waiting for an event */
+ list_add_tail(&aiocb->ki_list, &ctx->active_reqs);
+ aiocb->ki_cancel = aio_poll_cancel;
+ }
+ spin_unlock(&req->head->lock);
+ }
+ if (mask) { /* no async, we'd stolen it */
+ aiocb->ki_res.res = mangle_poll(mask);
apt.error = 0;
- } else if (mask || apt.error) {
- /* if we get an error or a mask we are done */
- WARN_ON_ONCE(list_empty(&req->wait.entry));
- list_del_init(&req->wait.entry);
- } else {
- /* actually waiting for an event */
- list_add_tail(&aiocb->ki_list, &ctx->active_reqs);
- aiocb->ki_cancel = aio_poll_cancel;
}
- spin_unlock(&req->head->lock);
spin_unlock_irq(&ctx->ctx_lock);
-
-out:
- if (unlikely(apt.error))
- return apt.error;
-
if (mask)
- aio_poll_complete(aiocb, mask);
- return 0;
+ iocb_put(aiocb);
+ return apt.error;
}
static int __io_submit_one(struct kioctx *ctx, const struct iocb *iocb,
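
For illustration, a minimal userspace sketch (not part of the patch) that
exercises the aio poll path fixed above: submit an IOCB_CMD_POLL iocb on a
pipe and reap the completion with raw syscalls. It assumes kernel headers
new enough (4.18+) to define IOCB_CMD_POLL.

#include <linux/aio_abi.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <string.h>
#include <stdio.h>
#include <poll.h>

int main(void)
{
	aio_context_t ctx = 0;
	struct iocb cb, *cbs[1] = { &cb };
	struct io_event ev;
	int pfd[2];

	if (pipe(pfd) || syscall(SYS_io_setup, 1, &ctx))
		return 1;

	memset(&cb, 0, sizeof(cb));
	cb.aio_fildes = pfd[0];
	cb.aio_lio_opcode = IOCB_CMD_POLL;
	cb.aio_buf = POLLIN;			/* requested events, as for poll(2) */

	if (syscall(SYS_io_submit, ctx, 1, cbs) != 1)
		return 1;
	if (write(pfd[1], "x", 1) != 1)		/* make the pipe readable */
		return 1;

	/* ev.res carries the poll mask (POLLIN here) on completion */
	if (syscall(SYS_io_getevents, ctx, 1, 1, &ev, NULL) == 1)
		printf("poll mask: 0x%llx\n", (unsigned long long)ev.res);
	return 0;
}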


@@ -1,152 +0,0 @@
From: Vladis Dronov <vdronov@redhat.com>
Date: Tue, 30 Jul 2019 11:33:45 +0200
Subject: Bluetooth: hci_uart: check for missing tty operations
Origin: https://git.kernel.org/linus/b36a1552d7319bbfd5cf7f08726c23c5c66d4f73
Bug-Debian-Security: https://security-tracker.debian.org/tracker/CVE-2019-10207
commit b36a1552d7319bbfd5cf7f08726c23c5c66d4f73 upstream.
Certain tty operations (pty_unix98_ops) lack the tiocmget() and tiocmset()
functions, which are called by certain HCI UART protocols (hci_ath,
hci_bcm, hci_intel, hci_mrvl, hci_qca) via hci_uart_set_flow_control()
or directly. This leads to execution at a NULL pointer and can be
triggered by an unprivileged user. Fix this by adding a helper function
and a check for the missing tty operations in the protocol code.
This fixes CVE-2019-10207. The Fixes: lines list commits where calls to
tiocm[gs]et() or hci_uart_set_flow_control() were added to the HCI UART
protocols.
Link: https://syzkaller.appspot.com/bug?id=1b42faa2848963564a5b1b7f8c837ea7b55ffa50
Reported-by: syzbot+79337b501d6aa974d0f6@syzkaller.appspotmail.com
Cc: stable@vger.kernel.org # v2.6.36+
Fixes: b3190df62861 ("Bluetooth: Support for Atheros AR300x serial chip")
Fixes: 118612fb9165 ("Bluetooth: hci_bcm: Add suspend/resume PM functions")
Fixes: ff2895592f0f ("Bluetooth: hci_intel: Add Intel baudrate configuration support")
Fixes: 162f812f23ba ("Bluetooth: hci_uart: Add Marvell support")
Fixes: fa9ad876b8e0 ("Bluetooth: hci_qca: Add support for Qualcomm Bluetooth chip wcn3990")
Signed-off-by: Vladis Dronov <vdronov@redhat.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Reviewed-by: Yu-Chen, Cho <acho@suse.com>
Tested-by: Yu-Chen, Cho <acho@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/bluetooth/hci_ath.c | 3 +++
drivers/bluetooth/hci_bcm.c | 3 +++
drivers/bluetooth/hci_intel.c | 3 +++
drivers/bluetooth/hci_ldisc.c | 13 +++++++++++++
drivers/bluetooth/hci_mrvl.c | 3 +++
drivers/bluetooth/hci_qca.c | 3 +++
drivers/bluetooth/hci_uart.h | 1 +
7 files changed, 29 insertions(+)
diff --git a/drivers/bluetooth/hci_ath.c b/drivers/bluetooth/hci_ath.c
index d568fbd94d6c..20235925344d 100644
--- a/drivers/bluetooth/hci_ath.c
+++ b/drivers/bluetooth/hci_ath.c
@@ -112,6 +112,9 @@ static int ath_open(struct hci_uart *hu)
BT_DBG("hu %p", hu);
+ if (!hci_uart_has_flow_control(hu))
+ return -EOPNOTSUPP;
+
ath = kzalloc(sizeof(*ath), GFP_KERNEL);
if (!ath)
return -ENOMEM;
diff --git a/drivers/bluetooth/hci_bcm.c b/drivers/bluetooth/hci_bcm.c
index 800132369134..aa6b7ed9fdf1 100644
--- a/drivers/bluetooth/hci_bcm.c
+++ b/drivers/bluetooth/hci_bcm.c
@@ -369,6 +369,9 @@ static int bcm_open(struct hci_uart *hu)
bt_dev_dbg(hu->hdev, "hu %p", hu);
+ if (!hci_uart_has_flow_control(hu))
+ return -EOPNOTSUPP;
+
bcm = kzalloc(sizeof(*bcm), GFP_KERNEL);
if (!bcm)
return -ENOMEM;
diff --git a/drivers/bluetooth/hci_intel.c b/drivers/bluetooth/hci_intel.c
index 46ace321bf60..e9228520e4c7 100644
--- a/drivers/bluetooth/hci_intel.c
+++ b/drivers/bluetooth/hci_intel.c
@@ -406,6 +406,9 @@ static int intel_open(struct hci_uart *hu)
BT_DBG("hu %p", hu);
+ if (!hci_uart_has_flow_control(hu))
+ return -EOPNOTSUPP;
+
intel = kzalloc(sizeof(*intel), GFP_KERNEL);
if (!intel)
return -ENOMEM;
diff --git a/drivers/bluetooth/hci_ldisc.c b/drivers/bluetooth/hci_ldisc.c
index c915daf01a89..efeb8137ec67 100644
--- a/drivers/bluetooth/hci_ldisc.c
+++ b/drivers/bluetooth/hci_ldisc.c
@@ -299,6 +299,19 @@ static int hci_uart_send_frame(struct hci_dev *hdev, struct sk_buff *skb)
return 0;
}
+/* Check the underlying device or tty has flow control support */
+bool hci_uart_has_flow_control(struct hci_uart *hu)
+{
+ /* serdev nodes check if the needed operations are present */
+ if (hu->serdev)
+ return true;
+
+ if (hu->tty->driver->ops->tiocmget && hu->tty->driver->ops->tiocmset)
+ return true;
+
+ return false;
+}
+
/* Flow control or un-flow control the device */
void hci_uart_set_flow_control(struct hci_uart *hu, bool enable)
{
diff --git a/drivers/bluetooth/hci_mrvl.c b/drivers/bluetooth/hci_mrvl.c
index ffb00669346f..23791df081ba 100644
--- a/drivers/bluetooth/hci_mrvl.c
+++ b/drivers/bluetooth/hci_mrvl.c
@@ -66,6 +66,9 @@ static int mrvl_open(struct hci_uart *hu)
BT_DBG("hu %p", hu);
+ if (!hci_uart_has_flow_control(hu))
+ return -EOPNOTSUPP;
+
mrvl = kzalloc(sizeof(*mrvl), GFP_KERNEL);
if (!mrvl)
return -ENOMEM;
diff --git a/drivers/bluetooth/hci_qca.c b/drivers/bluetooth/hci_qca.c
index 77004c29da08..f96e58de049b 100644
--- a/drivers/bluetooth/hci_qca.c
+++ b/drivers/bluetooth/hci_qca.c
@@ -450,6 +450,9 @@ static int qca_open(struct hci_uart *hu)
BT_DBG("hu %p qca_open", hu);
+ if (!hci_uart_has_flow_control(hu))
+ return -EOPNOTSUPP;
+
qca = kzalloc(sizeof(struct qca_data), GFP_KERNEL);
if (!qca)
return -ENOMEM;
diff --git a/drivers/bluetooth/hci_uart.h b/drivers/bluetooth/hci_uart.h
index 00cab2fd7a1b..067a610f1372 100644
--- a/drivers/bluetooth/hci_uart.h
+++ b/drivers/bluetooth/hci_uart.h
@@ -118,6 +118,7 @@ int hci_uart_tx_wakeup(struct hci_uart *hu);
int hci_uart_init_ready(struct hci_uart *hu);
void hci_uart_init_work(struct work_struct *work);
void hci_uart_set_baudrate(struct hci_uart *hu, unsigned int speed);
+bool hci_uart_has_flow_control(struct hci_uart *hu);
void hci_uart_set_flow_control(struct hci_uart *hu, bool enable);
void hci_uart_set_speeds(struct hci_uart *hu, unsigned int init_speed,
unsigned int oper_speed);
--
cgit 1.2-0.3.lf.el7
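
The shape of the fix, as a self-contained sketch (plain C, not the kernel
code): probe the ops table for the optional callbacks once at open time
instead of calling through possibly-NULL pointers later.

#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <errno.h>

struct tty_ops {
	int (*tiocmget)(void *tty);
	int (*tiocmset)(void *tty, unsigned set, unsigned clear);
};

struct uart { const struct tty_ops *ops; };

static bool has_flow_control(const struct uart *hu)
{
	/* analogous to hci_uart_has_flow_control() above */
	return hu->ops->tiocmget && hu->ops->tiocmset;
}

static int proto_open(struct uart *hu)
{
	if (!has_flow_control(hu))
		return -EOPNOTSUPP;	/* refuse early, as each *_open() now does */
	/* ... allocate protocol state; later RTS/CTS toggling is safe ... */
	return 0;
}

int main(void)
{
	static const struct tty_ops pty_like = { NULL, NULL };	/* like pty_unix98_ops */
	struct uart hu = { &pty_like };

	printf("open: %d\n", proto_open(&hu));	/* -EOPNOTSUPP instead of a crash */
	return 0;
}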


@@ -1,34 +0,0 @@
From: Young Xiao <YangX92@hotmail.com>
Date: Fri, 12 Apr 2019 15:24:30 +0800
Subject: Bluetooth: hidp: fix buffer overflow
Origin: https://git.kernel.org/linus/a1616a5ac99ede5d605047a9012481ce7ff18b16
Bug-Debian-Security: https://security-tracker.debian.org/tracker/CVE-2019-11884
The struct ca is copied from userspace. It is not checked whether the "name"
field is NUL-terminated, which allows local users to obtain potentially
sensitive information from kernel stack memory via a HIDPCONNADD command.
This vulnerability is similar to CVE-2011-1079.
Signed-off-by: Young Xiao <YangX92@hotmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Cc: stable@vger.kernel.org
---
net/bluetooth/hidp/sock.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/net/bluetooth/hidp/sock.c b/net/bluetooth/hidp/sock.c
index 9f85a1943be9..2151913892ce 100644
--- a/net/bluetooth/hidp/sock.c
+++ b/net/bluetooth/hidp/sock.c
@@ -75,6 +75,7 @@ static int do_hidp_sock_ioctl(struct socket *sock, unsigned int cmd, void __user
sockfd_put(csock);
return err;
}
+ ca.name[sizeof(ca.name)-1] = 0;
err = hidp_connection_add(&ca, csock, isock);
if (!err && copy_to_user(argp, &ca, sizeof(ca)))
--
2.20.1
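
The one-line fix in miniature, as a self-contained sketch (hypothetical
sizes, not the kernel structures): force-terminate a fixed-size name
copied from an untrusted source before treating it as a C string.

#include <stdio.h>
#include <string.h>

struct conn_req { char name[8]; };

int main(void)
{
	char user_buf[8] = { 'A', 'A', 'A', 'A', 'A', 'A', 'A', 'A' };	/* no NUL */
	struct conn_req ca;

	memcpy(&ca, user_buf, sizeof(ca));	/* stands in for copy_from_user() */
	ca.name[sizeof(ca.name) - 1] = 0;	/* the fix: bounded from here on */
	printf("%s\n", ca.name);
	return 0;
}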


@@ -1,760 +0,0 @@
From: Tim Chen <tim.c.chen@linux.intel.com>
Date: Thu, 20 Jun 2019 16:10:50 -0700
Subject: Documentation: Add section about CPU vulnerabilities for Spectre
Origin: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=8a815007f5fe292fa8ef082663e1259b9ae0571b
commit 6e88559470f581741bcd0f2794f9054814ac9740 upstream.
Add documentation for Spectre vulnerability and the mitigation mechanisms:
- Explain the problem and risks
- Document the mitigation mechanisms
- Document the command line controls
- Document the sysfs files
Co-developed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Co-developed-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
Documentation/admin-guide/hw-vuln/index.rst | 1 +
Documentation/admin-guide/hw-vuln/spectre.rst | 697 ++++++++++++++++++
Documentation/userspace-api/spec_ctrl.rst | 2 +
3 files changed, 700 insertions(+)
create mode 100644 Documentation/admin-guide/hw-vuln/spectre.rst
diff --git a/Documentation/admin-guide/hw-vuln/index.rst b/Documentation/admin-guide/hw-vuln/index.rst
index ffc064c1ec68..49311f3da6f2 100644
--- a/Documentation/admin-guide/hw-vuln/index.rst
+++ b/Documentation/admin-guide/hw-vuln/index.rst
@@ -9,5 +9,6 @@ are configurable at compile, boot or run time.
.. toctree::
:maxdepth: 1
+ spectre
l1tf
mds
diff --git a/Documentation/admin-guide/hw-vuln/spectre.rst b/Documentation/admin-guide/hw-vuln/spectre.rst
new file mode 100644
index 000000000000..25f3b2532198
--- /dev/null
+++ b/Documentation/admin-guide/hw-vuln/spectre.rst
@@ -0,0 +1,697 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+Spectre Side Channels
+=====================
+
+Spectre is a class of side channel attacks that exploit branch prediction
+and speculative execution on modern CPUs to read memory, possibly
+bypassing access controls. Speculative execution side channel exploits
+do not modify memory but attempt to infer privileged data in the memory.
+
+This document covers Spectre variant 1 and Spectre variant 2.
+
+Affected processors
+-------------------
+
+Speculative execution side channel methods affect a wide range of modern
+high performance processors, since most modern high speed processors
+use branch prediction and speculative execution.
+
+The following CPUs are vulnerable:
+
+ - Intel Core, Atom, Pentium, and Xeon processors
+
+ - AMD Phenom, EPYC, and Zen processors
+
+ - IBM POWER and zSeries processors
+
+ - Higher end ARM processors
+
+ - Apple CPUs
+
+ - Higher end MIPS CPUs
+
+ - Likely most other high performance CPUs. Contact your CPU vendor for details.
+
+Whether a processor is affected or not can be read out from the Spectre
+vulnerability files in sysfs. See :ref:`spectre_sys_info`.
+
+Related CVEs
+------------
+
+The following CVE entries describe Spectre variants:
+
+ ============= ======================= =================
+ CVE-2017-5753 Bounds check bypass Spectre variant 1
+ CVE-2017-5715 Branch target injection Spectre variant 2
+ ============= ======================= =================
+
+Problem
+-------
+
+CPUs use speculative operations to improve performance. That may leave
+traces of memory accesses or computations in the processor's caches,
+buffers, and branch predictors. Malicious software may be able to
+influence the speculative execution paths, and then use the side effects
+of the speculative execution in the CPUs' caches and buffers to infer
+privileged data touched during the speculative execution.
+
+Spectre variant 1 attacks take advantage of speculative execution of
+conditional branches, while Spectre variant 2 attacks use speculative
+execution of indirect branches to leak privileged memory.
+See :ref:`[1] <spec_ref1>` :ref:`[5] <spec_ref5>` :ref:`[7] <spec_ref7>`
+:ref:`[10] <spec_ref10>` :ref:`[11] <spec_ref11>`.
+
+Spectre variant 1 (Bounds Check Bypass)
+---------------------------------------
+
+The bounds check bypass attack :ref:`[2] <spec_ref2>` takes advantage
+of speculative execution that bypasses conditional branch instructions
+used for memory access bounds check (e.g. checking if the index of an
+array results in memory access within a valid range). This results in
+memory accesses to invalid memory (with out-of-bound index) that are
+done speculatively before validation checks resolve. Such speculative
+memory accesses can leave side effects, creating side channels which
+leak information to the attacker.
+
+There are some extensions of Spectre variant 1 attacks for reading data
+over the network, see :ref:`[12] <spec_ref12>`. However such attacks
+are difficult, low bandwidth, fragile, and are considered low risk.
+
+Spectre variant 2 (Branch Target Injection)
+-------------------------------------------
+
+The branch target injection attack takes advantage of speculative
+execution of indirect branches :ref:`[3] <spec_ref3>`. The indirect
+branch predictors inside the processor used to guess the target of
+indirect branches can be influenced by an attacker, causing gadget code
+to be speculatively executed, thus exposing sensitive data touched by
+the victim. The side effects left in the CPU's caches during speculative
+execution can be measured to infer data values.
+
+.. _poison_btb:
+
+In Spectre variant 2 attacks, the attacker can steer speculative indirect
+branches in the victim to gadget code by poisoning the branch target
+buffer of a CPU used for predicting indirect branch addresses. Such
+poisoning could be done by indirect branching into existing code,
+with the address offset of the indirect branch under the attacker's
+control. Since the branch prediction on impacted hardware does not
+fully disambiguate branch address and uses the offset for prediction,
+this could cause privileged code's indirect branch to jump to a gadget
+code with the same offset.
+
+The most useful gadgets take an attacker-controlled input parameter (such
+as a register value) so that the memory read can be controlled. Gadgets
+without input parameters might be possible, but the attacker would have
+very little control over what memory can be read, reducing the risk of
+the attack revealing useful data.
+
+One other variant 2 attack vector is for the attacker to poison the
+return stack buffer (RSB) :ref:`[13] <spec_ref13>` to cause speculative
+subroutine return instruction execution to go to a gadget. An attacker's
+imbalanced subroutine call instructions might "poison" entries in the
+return stack buffer which are later consumed by a victim's subroutine
+return instructions. This attack can be mitigated by flushing the return
+stack buffer on context switch, or virtual machine (VM) exit.
+
+On systems with simultaneous multi-threading (SMT), attacks are possible
+from the sibling thread, as level 1 cache and branch target buffer
+(BTB) may be shared between hardware threads in a CPU core. A malicious
+program running on the sibling thread may influence its peer's BTB to
+steer its indirect branch speculations to gadget code, and measure the
+speculative execution's side effects left in level 1 cache to infer the
+victim's data.
+
+Attack scenarios
+----------------
+
+The following list of attack scenarios have been anticipated, but may
+not cover all possible attack vectors.
+
+1. A user process attacking the kernel
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+ The attacker passes a parameter to the kernel via a register or
+ via a known address in memory during a syscall. Such parameter may
+ be used later by the kernel as an index to an array or to derive
+ a pointer for a Spectre variant 1 attack. The index or pointer
+ is invalid, but bound checks are bypassed in the code branch taken
+ for speculative execution. This could cause privileged memory to be
+ accessed and leaked.
+
+ For kernel code that has been identified where data pointers could
+ potentially be influenced for Spectre attacks, new "nospec" accessor
+ macros are used to prevent speculative loading of data.
+
+ Spectre variant 2 attacker can :ref:`poison <poison_btb>` the branch
+ target buffer (BTB) before issuing syscall to launch an attack.
+ After entering the kernel, the kernel could use the poisoned branch
+ target buffer on indirect jump and jump to gadget code in speculative
+ execution.
+
+ If an attacker tries to control the memory addresses leaked during
+ speculative execution, he would also need to pass a parameter to the
+ gadget, either through a register or a known address in memory. After
+ the gadget has executed, he can measure the side effect.
+
+ The kernel can protect itself against consuming poisoned branch
+ target buffer entries by using return trampolines (also known as
+ "retpoline") :ref:`[3] <spec_ref3>` :ref:`[9] <spec_ref9>` for all
+ indirect branches. Return trampolines trap speculative execution paths
+ to prevent jumping to gadget code during speculative execution.
+ x86 CPUs with Enhanced Indirect Branch Restricted Speculation
+ (Enhanced IBRS) available in hardware should use the feature to
+ mitigate Spectre variant 2 instead of retpoline. Enhanced IBRS is
+ more efficient than retpoline.
+
+ There may be gadget code in firmware which could be exploited with
+ Spectre variant 2 attack by a rogue user process. To mitigate such
+ attacks on x86, Indirect Branch Restricted Speculation (IBRS) feature
+ is turned on before the kernel invokes any firmware code.
+
+2. A user process attacking another user process
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+ A malicious user process can try to attack another user process,
+ either via a context switch on the same hardware thread, or from the
+ sibling hyperthread sharing a physical processor core on simultaneous
+ multi-threading (SMT) system.
+
+ Spectre variant 1 attacks generally require passing parameters
+ between the processes, which needs a data passing relationship, such
+ as remote procedure calls (RPC). Those parameters are used in gadget
+ code to derive invalid data pointers accessing privileged memory in
+ the attacked process.
+
+ Spectre variant 2 attacks can be launched from a rogue process by
+ :ref:`poisoning <poison_btb>` the branch target buffer. This can
+ influence the indirect branch targets for a victim process that either
+ runs later on the same hardware thread, or running concurrently on
+ a sibling hardware thread sharing the same physical core.
+
+ A user process can protect itself against Spectre variant 2 attacks
+ by using the prctl() syscall to disable indirect branch speculation
+ for itself. An administrator can also cordon off an unsafe process
+ from polluting the branch target buffer by disabling the process's
+ indirect branch speculation. This comes with a performance cost
+ from not using indirect branch speculation and clearing the branch
+ target buffer. When SMT is enabled on x86, for a process that has
+ indirect branch speculation disabled, Single Threaded Indirect Branch
+ Predictors (STIBP) :ref:`[4] <spec_ref4>` are turned on to prevent the
+ sibling thread from controlling branch target buffer. In addition,
+ the Indirect Branch Prediction Barrier (IBPB) is issued to clear the
+ branch target buffer when context switching to and from such process.
+
+ On x86, the return stack buffer is stuffed on context switch.
+ This prevents the branch target buffer from being used for branch
+ prediction when the return stack buffer underflows while switching to
+ a deeper call stack. Any poisoned entries in the return stack buffer
+ left by the previous process will also be cleared.
+
+ User programs should use address space randomization to make attacks
+ more difficult (Set /proc/sys/kernel/randomize_va_space = 1 or 2).
+
+3. A virtualized guest attacking the host
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+ The attack mechanism is similar to how user processes attack the
+ kernel. The kernel is entered via hyper-calls or other virtualization
+ exit paths.
+
+ For Spectre variant 1 attacks, rogue guests can pass parameters
+ (e.g. in registers) via hyper-calls to derive invalid pointers to
+ speculate into privileged memory after entering the kernel. For places
+ where such kernel code has been identified, nospec accessor macros
+ are used to stop speculative memory access.
+
+ For Spectre variant 2 attacks, rogue guests can :ref:`poison
+ <poison_btb>` the branch target buffer or return stack buffer, causing
+ the kernel to jump to gadget code in the speculative execution paths.
+
+ To mitigate variant 2, the host kernel can use return trampolines
+ for indirect branches to bypass the poisoned branch target buffer,
+ and flushing the return stack buffer on VM exit. This prevents rogue
+ guests from affecting indirect branching in the host kernel.
+
+ To protect host processes from rogue guests, host processes can have
+ indirect branch speculation disabled via prctl(). The branch target
+ buffer is cleared before context switching to such processes.
+
+4. A virtualized guest attacking other guest
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+ A rogue guest may attack another guest to get data accessible by the
+ other guest.
+
+ Spectre variant 1 attacks are possible if parameters can be passed
+ between guests. This may be done via mechanisms such as shared memory
+ or message passing. Such parameters could be used to derive data
+ pointers to privileged data in guest. The privileged data could be
+ accessed by gadget code in the victim's speculation paths.
+
+ Spectre variant 2 attacks can be launched from a rogue guest by
+ :ref:`poisoning <poison_btb>` the branch target buffer or the return
+ stack buffer. Such poisoned entries could be used to influence
+ speculation execution paths in the victim guest.
+
+ Linux kernel mitigates attacks to other guests running in the same
+ CPU hardware thread by flushing the return stack buffer on VM exit,
+ and clearing the branch target buffer before switching to a new guest.
+
+ If SMT is used, Spectre variant 2 attacks from an untrusted guest
+ in the sibling hyperthread can be mitigated by the administrator,
+ by turning off the unsafe guest's indirect branch speculation via
+ prctl(). A guest can also protect itself by turning on microcode
+ based mitigations (such as IBPB or STIBP on x86) within the guest.
+
+.. _spectre_sys_info:
+
+Spectre system information
+--------------------------
+
+The Linux kernel provides a sysfs interface to enumerate the current
+mitigation status of the system for Spectre: whether the system is
+vulnerable, and which mitigations are active.
+
+The sysfs file showing Spectre variant 1 mitigation status is:
+
+ /sys/devices/system/cpu/vulnerabilities/spectre_v1
+
+The possible values in this file are:
+
+ ======================================= =================================
+ 'Mitigation: __user pointer sanitation' Protection in kernel on a case by
+ case base with explicit pointer
+ sanitation.
+ ======================================= =================================
+
+However, the protections are put in place on a case by case basis,
+and there is no guarantee that all possible attack vectors for Spectre
+variant 1 are covered.
+
+The spectre_v2 kernel file reports if the kernel has been compiled with
+retpoline mitigation or if the CPU has hardware mitigation, and if the
+CPU has support for additional process-specific mitigation.
+
+This file also reports CPU features enabled by microcode to mitigate
+attack between user processes:
+
+1. Indirect Branch Prediction Barrier (IBPB) to add additional
+ isolation between processes of different users.
+2. Single Thread Indirect Branch Predictors (STIBP) to add additional
+ isolation between CPU threads running on the same core.
+
+These CPU features may impact performance when used and can be enabled
+per process on a case-by-case base.
+
+The sysfs file showing Spectre variant 2 mitigation status is:
+
+ /sys/devices/system/cpu/vulnerabilities/spectre_v2
+
+The possible values in this file are:
+
+ - Kernel status:
+
+ ==================================== =================================
+ 'Not affected' The processor is not vulnerable
+ 'Vulnerable' Vulnerable, no mitigation
+ 'Mitigation: Full generic retpoline' Software-focused mitigation
+ 'Mitigation: Full AMD retpoline' AMD-specific software mitigation
+ 'Mitigation: Enhanced IBRS' Hardware-focused mitigation
+ ==================================== =================================
+
+ - Firmware status: Show if Indirect Branch Restricted Speculation (IBRS) is
+ used to protect against Spectre variant 2 attacks when calling firmware (x86 only).
+
+ ========== =============================================================
+ 'IBRS_FW' Protection against user program attacks when calling firmware
+ ========== =============================================================
+
+ - Indirect branch prediction barrier (IBPB) status for protection between
+ processes of different users. This feature can be controlled through
+ prctl() per process, or through kernel command line options. This is
+ an x86 only feature. For more details see below.
+
+ =================== ========================================================
+ 'IBPB: disabled' IBPB unused
+ 'IBPB: always-on' Use IBPB on all tasks
+ 'IBPB: conditional' Use IBPB on SECCOMP or indirect branch restricted tasks
+ =================== ========================================================
+
+ - Single threaded indirect branch prediction (STIBP) status for protection
+ between different hyper threads. This feature can be controlled through
+ prctl per process, or through kernel command line options. This is x86
+ only feature. For more details see below.
+
+ ==================== ========================================================
+ 'STIBP: disabled' STIBP unused
+ 'STIBP: forced' Use STIBP on all tasks
+ 'STIBP: conditional' Use STIBP on SECCOMP or indirect branch restricted tasks
+ ==================== ========================================================
+
+ - Return stack buffer (RSB) protection status:
+
+ ============= ===========================================
+ 'RSB filling' Protection of RSB on context switch enabled
+ ============= ===========================================
+
+Full mitigation might require a microcode update from the CPU
+vendor. When the necessary microcode is not available, the kernel will
+report vulnerability.
+
+Turning on mitigation for Spectre variant 1 and Spectre variant 2
+-----------------------------------------------------------------
+
+1. Kernel mitigation
+^^^^^^^^^^^^^^^^^^^^
+
+ For the Spectre variant 1, vulnerable kernel code (as determined
+ by code audit or scanning tools) is annotated on a case by case
+ basis to use nospec accessor macros for bounds clipping :ref:`[2]
+ <spec_ref2>` to avoid any usable disclosure gadgets. However, it may
+ not cover all attack vectors for Spectre variant 1.
+
+ For Spectre variant 2 mitigation, the compiler turns indirect calls or
+ jumps in the kernel into equivalent return trampolines (retpolines)
+ :ref:`[3] <spec_ref3>` :ref:`[9] <spec_ref9>` to go to the target
+ addresses. Speculative execution paths under retpolines are trapped
+ in an infinite loop to prevent any speculative execution jumping to
+ a gadget.
+
+ To turn on retpoline mitigation on a vulnerable CPU, the kernel
+ needs to be compiled with a gcc compiler that supports the
+ -mindirect-branch=thunk-extern -mindirect-branch-register options.
+ If the kernel is compiled with a Clang compiler, the compiler needs
+ to support -mretpoline-external-thunk option. The kernel config
+ CONFIG_RETPOLINE needs to be turned on, and the CPU needs to run with
+ the latest updated microcode.
+
+ On Intel Skylake-era systems the mitigation covers most, but not all,
+ cases. See :ref:`[3] <spec_ref3>` for more details.
+
+ On CPUs with hardware mitigation for Spectre variant 2 (e.g. Enhanced
+ IBRS on x86), retpoline is automatically disabled at run time.
+
+ The retpoline mitigation is turned on by default on vulnerable
+ CPUs. It can be forced on or off by the administrator
+ via the kernel command line and sysfs control files. See
+ :ref:`spectre_mitigation_control_command_line`.
+
+ On x86, indirect branch restricted speculation is turned on by default
+ before invoking any firmware code to prevent Spectre variant 2 exploits
+ using the firmware.
+
+ Using kernel address space randomization (CONFIG_RANDOMIZE_SLAB=y
+ and CONFIG_SLAB_FREELIST_RANDOM=y in the kernel configuration) makes
+ attacks on the kernel generally more difficult.
+
+2. User program mitigation
+^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+ User programs can mitigate Spectre variant 1 using LFENCE or "bounds
+ clipping". For more details see :ref:`[2] <spec_ref2>`.
+
+ For Spectre variant 2 mitigation, individual user programs
+ can be compiled with return trampolines for indirect branches.
+ This protects them from consuming poisoned entries in the branch
+ target buffer left by malicious software. Alternatively, the
+ programs can disable their indirect branch speculation via prctl()
+ (See :ref:`Documentation/userspace-api/spec_ctrl.rst <set_spec_ctrl>`).
+ On x86, this will turn on STIBP to guard against attacks from the
+ sibling thread when the user program is running, and use IBPB to
+ flush the branch target buffer when switching to/from the program.
+
+ Restricting indirect branch speculation on a user program will
+ also prevent the program from launching a variant 2 attack
+ on x86. All sand-boxed SECCOMP programs have indirect branch
+ speculation restricted by default. Administrators can change
+ that behavior via the kernel command line and sysfs control files.
+ See :ref:`spectre_mitigation_control_command_line`.
+
+ Programs that disable their indirect branch speculation will have
+ more overhead and run slower.
+
+ User programs should use address space randomization
+ (/proc/sys/kernel/randomize_va_space = 1 or 2) to make attacks more
+ difficult.
+
+3. VM mitigation
+^^^^^^^^^^^^^^^^
+
+ Within the kernel, Spectre variant 1 attacks from rogue guests are
+ mitigated on a case by case basis in VM exit paths. Vulnerable code
+ uses nospec accessor macros for "bounds clipping", to avoid any
+ usable disclosure gadgets. However, this may not cover all variant
+ 1 attack vectors.
+
+ For Spectre variant 2 attacks from rogue guests to the kernel, the
+ Linux kernel uses retpoline or Enhanced IBRS to prevent consumption of
+ poisoned entries in branch target buffer left by rogue guests. It also
+ flushes the return stack buffer on every VM exit to prevent a return
+ stack buffer underflow so poisoned branch target buffer could be used,
+ or attacker guests leaving poisoned entries in the return stack buffer.
+
+ To mitigate guest-to-guest attacks in the same CPU hardware thread,
+ the branch target buffer is sanitized by flushing before switching
+ to a new guest on a CPU.
+
+ The above mitigations are turned on by default on vulnerable CPUs.
+
+ To mitigate guest-to-guest attacks from sibling thread when SMT is
+ in use, an untrusted guest running in the sibling thread can have
+ its indirect branch speculation disabled by administrator via prctl().
+
+ The kernel also allows guests to use any microcode based mitigation
+ they choose to use (such as IBPB or STIBP on x86) to protect themselves.
+
+.. _spectre_mitigation_control_command_line:
+
+Mitigation control on the kernel command line
+---------------------------------------------
+
+Spectre variant 2 mitigation can be disabled or force enabled at the
+kernel command line.
+
+ nospectre_v2
+
+ [X86] Disable all mitigations for the Spectre variant 2
+ (indirect branch prediction) vulnerability. System may
+ allow data leaks with this option, which is equivalent
+ to spectre_v2=off.
+
+
+ spectre_v2=
+
+ [X86] Control mitigation of Spectre variant 2
+ (indirect branch speculation) vulnerability.
+ The default operation protects the kernel from
+ user space attacks.
+
+ on
+ unconditionally enable, implies
+ spectre_v2_user=on
+ off
+ unconditionally disable, implies
+ spectre_v2_user=off
+ auto
+ kernel detects whether your CPU model is
+ vulnerable
+
+ Selecting 'on' will, and 'auto' may, choose a
+ mitigation method at run time according to the
+ CPU, the available microcode, the setting of the
+ CONFIG_RETPOLINE configuration option, and the
+ compiler with which the kernel was built.
+
+ Selecting 'on' will also enable the mitigation
+ against user space to user space task attacks.
+
+ Selecting 'off' will disable both the kernel and
+ the user space protections.
+
+ Specific mitigations can also be selected manually:
+
+ retpoline
+ replace indirect branches
+ retpoline,generic
+ google's original retpoline
+ retpoline,amd
+ AMD-specific minimal thunk
+
+ Not specifying this option is equivalent to
+ spectre_v2=auto.
+
+For user space mitigation:
+
+ spectre_v2_user=
+
+ [X86] Control mitigation of Spectre variant 2
+ (indirect branch speculation) vulnerability between
+ user space tasks
+
+ on
+ Unconditionally enable mitigations. Is
+ enforced by spectre_v2=on
+
+ off
+ Unconditionally disable mitigations. Is
+ enforced by spectre_v2=off
+
+ prctl
+ Indirect branch speculation is enabled,
+ but mitigation can be enabled via prctl
+ per thread. The mitigation control state
+ is inherited on fork.
+
+ prctl,ibpb
+ Like "prctl" above, but only STIBP is
+ controlled per thread. IBPB is issued
+ always when switching between different user
+ space processes.
+
+ seccomp
+ Same as "prctl" above, but all seccomp
+ threads will enable the mitigation unless
+ they explicitly opt out.
+
+ seccomp,ibpb
+ Like "seccomp" above, but only STIBP is
+ controlled per thread. IBPB is issued
+ always when switching between different
+ user space processes.
+
+ auto
+ Kernel selects the mitigation depending on
+ the available CPU features and vulnerability.
+
+ Default mitigation:
+ If CONFIG_SECCOMP=y then "seccomp", otherwise "prctl"
+
+ Not specifying this option is equivalent to
+ spectre_v2_user=auto.
+
+ In general the kernel by default selects
+ reasonable mitigations for the current CPU. To
+ disable Spectre variant 2 mitigations, boot with
+ spectre_v2=off. Spectre variant 1 mitigations
+ cannot be disabled.
+
+Mitigation selection guide
+--------------------------
+
+1. Trusted userspace
+^^^^^^^^^^^^^^^^^^^^
+
+ If all userspace applications are from trusted sources and do not
+ execute externally supplied untrusted code, then the mitigations can
+ be disabled.
+
+2. Protect sensitive programs
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+ For security-sensitive programs that have secrets (e.g. crypto
+ keys), protection against Spectre variant 2 can be put in place by
+ disabling indirect branch speculation when the program is running
+ (See :ref:`Documentation/userspace-api/spec_ctrl.rst <set_spec_ctrl>`).
+
+3. Sandbox untrusted programs
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+ Untrusted programs that could be a source of attacks can be cordoned
+ off by disabling their indirect branch speculation when they are run
+ (See :ref:`Documentation/userspace-api/spec_ctrl.rst <set_spec_ctrl>`).
+ This prevents untrusted programs from polluting the branch target
+ buffer. All programs running in SECCOMP sandboxes have indirect
+ branch speculation restricted by default. This behavior can be
+ changed via the kernel command line and sysfs control files. See
+ :ref:`spectre_mitigation_control_command_line`.
+
+4. High security mode
+^^^^^^^^^^^^^^^^^^^^^
+
+ All Spectre variant 2 mitigations can be forced on
+ at boot time for all programs (See the "on" option in
+ :ref:`spectre_mitigation_control_command_line`). This will add
+ overhead as indirect branch speculations for all programs will be
+ restricted.
+
+ On x86, branch target buffer will be flushed with IBPB when switching
+ to a new program. STIBP is left on all the time to protect programs
+ against variant 2 attacks originating from programs running on
+ sibling threads.
+
+ Alternatively, STIBP can be used only when running programs
+ whose indirect branch speculation is explicitly disabled,
+ while IBPB is still used all the time when switching to a new
+ program to clear the branch target buffer (See "ibpb" option in
+ :ref:`spectre_mitigation_control_command_line`). This "ibpb" option
+ has less performance cost than the "on" option, which leaves STIBP
+ on all the time.
+
+References on Spectre
+---------------------
+
+Intel white papers:
+
+.. _spec_ref1:
+
+[1] `Intel analysis of speculative execution side channels <https://newsroom.intel.com/wp-content/uploads/sites/11/2018/01/Intel-Analysis-of-Speculative-Execution-Side-Channels.pdf>`_.
+
+.. _spec_ref2:
+
+[2] `Bounds check bypass <https://software.intel.com/security-software-guidance/software-guidance/bounds-check-bypass>`_.
+
+.. _spec_ref3:
+
+[3] `Deep dive: Retpoline: A branch target injection mitigation <https://software.intel.com/security-software-guidance/insights/deep-dive-retpoline-branch-target-injection-mitigation>`_.
+
+.. _spec_ref4:
+
+[4] `Deep Dive: Single Thread Indirect Branch Predictors <https://software.intel.com/security-software-guidance/insights/deep-dive-single-thread-indirect-branch-predictors>`_.
+
+AMD white papers:
+
+.. _spec_ref5:
+
+[5] `AMD64 technology indirect branch control extension <https://developer.amd.com/wp-content/resources/Architecture_Guidelines_Update_Indirect_Branch_Control.pdf>`_.
+
+.. _spec_ref6:
+
+[6] `Software techniques for managing speculation on AMD processors <https://developer.amd.com/wp-content/resources/90343-B_SoftwareTechniquesforManagingSpeculation_WP_7-18Update_FNL.pdf>`_.
+
+ARM white papers:
+
+.. _spec_ref7:
+
+[7] `Cache speculation side-channels <https://developer.arm.com/support/arm-security-updates/speculative-processor-vulnerability/download-the-whitepaper>`_.
+
+.. _spec_ref8:
+
+[8] `Cache speculation issues update <https://developer.arm.com/support/arm-security-updates/speculative-processor-vulnerability/latest-updates/cache-speculation-issues-update>`_.
+
+Google white paper:
+
+.. _spec_ref9:
+
+[9] `Retpoline: a software construct for preventing branch-target-injection <https://support.google.com/faqs/answer/7625886>`_.
+
+MIPS white paper:
+
+.. _spec_ref10:
+
+[10] `MIPS: response on speculative execution and side channel vulnerabilities <https://www.mips.com/blog/mips-response-on-speculative-execution-and-side-channel-vulnerabilities/>`_.
+
+Academic papers:
+
+.. _spec_ref11:
+
+[11] `Spectre Attacks: Exploiting Speculative Execution <https://spectreattack.com/spectre.pdf>`_.
+
+.. _spec_ref12:
+
+[12] `NetSpectre: Read Arbitrary Memory over Network <https://arxiv.org/abs/1807.10535>`_.
+
+.. _spec_ref13:
+
+[13] `Spectre Returns! Speculation Attacks using the Return Stack Buffer <https://www.usenix.org/system/files/conference/woot18/woot18-paper-koruyeh.pdf>`_.
diff --git a/Documentation/userspace-api/spec_ctrl.rst b/Documentation/userspace-api/spec_ctrl.rst
index c4dbe6f7cdae..0fda8f614110 100644
--- a/Documentation/userspace-api/spec_ctrl.rst
+++ b/Documentation/userspace-api/spec_ctrl.rst
@@ -47,6 +47,8 @@ If PR_SPEC_PRCTL is set, then the per-task control of the mitigation is
available. If not set, prctl(PR_SET_SPECULATION_CTRL) for the speculation
misfeature will fail.
+.. _set_spec_ctrl:
+
PR_SET_SPECULATION_CTRL
-----------------------
--
2.20.1
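
A userspace sketch of the prctl() control the document points to: opt the
current task out of indirect branch speculation. The fallback macro values
match linux/prctl.h; the calls assume a kernel with per-task Spectre v2
control (4.19-era with the feature).

#include <sys/prctl.h>
#include <stdio.h>

#ifndef PR_SET_SPECULATION_CTRL
#define PR_GET_SPECULATION_CTRL	52
#define PR_SET_SPECULATION_CTRL	53
#endif
#ifndef PR_SPEC_INDIRECT_BRANCH
#define PR_SPEC_INDIRECT_BRANCH	1
#endif
#ifndef PR_SPEC_DISABLE
#define PR_SPEC_DISABLE		(1UL << 2)
#endif

int main(void)
{
	if (prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH,
		  PR_SPEC_DISABLE, 0, 0))
		perror("PR_SET_SPECULATION_CTRL");

	/* returns a PR_SPEC_* bitmask, or -1 if the control is unsupported */
	printf("state: %d\n",
	       (int)prctl(PR_GET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH,
			  0, 0, 0));
	return 0;
}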


@@ -1,170 +0,0 @@
From: Josh Poimboeuf <jpoimboe@redhat.com>
Date: Sat, 3 Aug 2019 21:21:54 +0200
Subject: Documentation: Add swapgs description to the Spectre v1 documentation
Origin: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=7634b9cd27e8f867dd3438d262c78d4b9262497f
Bug-Debian-Security: https://security-tracker.debian.org/tracker/CVE-2019-1125
commit 4c92057661a3412f547ede95715641d7ee16ddac upstream
Add documentation to the Spectre document about the new swapgs variant of
Spectre v1.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
Documentation/admin-guide/hw-vuln/spectre.rst | 88 +++++++++++++++++--
1 file changed, 80 insertions(+), 8 deletions(-)
diff --git a/Documentation/admin-guide/hw-vuln/spectre.rst b/Documentation/admin-guide/hw-vuln/spectre.rst
index 25f3b2532198..e05e581af5cf 100644
--- a/Documentation/admin-guide/hw-vuln/spectre.rst
+++ b/Documentation/admin-guide/hw-vuln/spectre.rst
@@ -41,10 +41,11 @@ Related CVEs
The following CVE entries describe Spectre variants:
- ============= ======================= =================
+ ============= ======================= ==========================
CVE-2017-5753 Bounds check bypass Spectre variant 1
CVE-2017-5715 Branch target injection Spectre variant 2
- ============= ======================= =================
+ CVE-2019-1125 Spectre v1 swapgs Spectre variant 1 (swapgs)
+ ============= ======================= ==========================
Problem
-------
@@ -78,6 +79,13 @@ There are some extensions of Spectre variant 1 attacks for reading data
over the network, see :ref:`[12] <spec_ref12>`. However such attacks
are difficult, low bandwidth, fragile, and are considered low risk.
+Note that, despite "Bounds Check Bypass" name, Spectre variant 1 is not
+only about user-controlled array bounds checks. It can affect any
+conditional checks. The kernel entry code interrupt, exception, and NMI
+handlers all have conditional swapgs checks. Those may be problematic
+in the context of Spectre v1, as kernel code can speculatively run with
+a user GS.
+
Spectre variant 2 (Branch Target Injection)
-------------------------------------------
@@ -132,6 +140,9 @@ not cover all possible attack vectors.
1. A user process attacking the kernel
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+Spectre variant 1
+~~~~~~~~~~~~~~~~~
+
The attacker passes a parameter to the kernel via a register or
via a known address in memory during a syscall. Such parameter may
be used later by the kernel as an index to an array or to derive
@@ -144,7 +155,40 @@ not cover all possible attack vectors.
potentially be influenced for Spectre attacks, new "nospec" accessor
macros are used to prevent speculative loading of data.
- Spectre variant 2 attacker can :ref:`poison <poison_btb>` the branch
+Spectre variant 1 (swapgs)
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+ An attacker can train the branch predictor to speculatively skip the
+ swapgs path for an interrupt or exception. If they initialize
+ the GS register to a user-space value, if the swapgs is speculatively
+ skipped, subsequent GS-related percpu accesses in the speculation
+ window will be done with the attacker-controlled GS value. This
+ could cause privileged memory to be accessed and leaked.
+
+ For example:
+
+ ::
+
+ if (coming from user space)
+ swapgs
+ mov %gs:<percpu_offset>, %reg
+ mov (%reg), %reg1
+
+ When coming from user space, the CPU can speculatively skip the
+ swapgs, and then do a speculative percpu load using the user GS
+ value. So the user can speculatively force a read of any kernel
+ value. If a gadget exists which uses the percpu value as an address
+ in another load/store, then the contents of the kernel value may
+ become visible via an L1 side channel attack.
+
+ A similar attack exists when coming from kernel space. The CPU can
+ speculatively do the swapgs, causing the user GS to get used for the
+ rest of the speculative window.
+
+Spectre variant 2
+~~~~~~~~~~~~~~~~~
+
+ A spectre variant 2 attacker can :ref:`poison <poison_btb>` the branch
target buffer (BTB) before issuing syscall to launch an attack.
After entering the kernel, the kernel could use the poisoned branch
target buffer on indirect jump and jump to gadget code in speculative
@@ -280,11 +324,18 @@ The sysfs file showing Spectre variant 1 mitigation status is:
The possible values in this file are:
- ======================================= =================================
- 'Mitigation: __user pointer sanitation' Protection in kernel on a case by
- case base with explicit pointer
- sanitation.
- ======================================= =================================
+ .. list-table::
+
+ * - 'Not affected'
+ - The processor is not vulnerable.
+ * - 'Vulnerable: __user pointer sanitization and usercopy barriers only; no swapgs barriers'
+ - The swapgs protections are disabled; otherwise it has
+ protection in the kernel on a case by case base with explicit
+ pointer sanitation and usercopy LFENCE barriers.
+ * - 'Mitigation: usercopy/swapgs barriers and __user pointer sanitization'
+ - Protection in the kernel on a case by case base with explicit
+ pointer sanitation, usercopy LFENCE barriers, and swapgs LFENCE
+ barriers.
However, the protections are put in place on a case by case basis,
and there is no guarantee that all possible attack vectors for Spectre
@@ -366,12 +417,27 @@ Turning on mitigation for Spectre variant 1 and Spectre variant 2
1. Kernel mitigation
^^^^^^^^^^^^^^^^^^^^
+Spectre variant 1
+~~~~~~~~~~~~~~~~~
+
For the Spectre variant 1, vulnerable kernel code (as determined
by code audit or scanning tools) is annotated on a case by case
basis to use nospec accessor macros for bounds clipping :ref:`[2]
<spec_ref2>` to avoid any usable disclosure gadgets. However, it may
not cover all attack vectors for Spectre variant 1.
+ Copy-from-user code has an LFENCE barrier to prevent the access_ok()
+ check from being mis-speculated. The barrier is done by the
+ barrier_nospec() macro.
+
+ For the swapgs variant of Spectre variant 1, LFENCE barriers are
+ added to interrupt, exception and NMI entry where needed. These
+ barriers are done by the FENCE_SWAPGS_KERNEL_ENTRY and
+ FENCE_SWAPGS_USER_ENTRY macros.
+
+Spectre variant 2
+~~~~~~~~~~~~~~~~~
+
For Spectre variant 2 mitigation, the compiler turns indirect calls or
jumps in the kernel into equivalent return trampolines (retpolines)
:ref:`[3] <spec_ref3>` :ref:`[9] <spec_ref9>` to go to the target
@@ -473,6 +539,12 @@ Mitigation control on the kernel command line
Spectre variant 2 mitigation can be disabled or force enabled at the
kernel command line.
+ nospectre_v1
+
+ [X86,PPC] Disable mitigations for Spectre Variant 1
+ (bounds check bypass). With this option data leaks are
+ possible in the system.
+
nospectre_v2
[X86] Disable all mitigations for the Spectre variant 2
--
2.20.1
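
To see which of the status strings added above applies on a given system,
a trivial sketch that reads the spectre_v1 sysfs file:

#include <stdio.h>

int main(void)
{
	char line[256];
	FILE *f = fopen("/sys/devices/system/cpu/vulnerabilities/spectre_v1", "r");

	if (f && fgets(line, sizeof(line), f))
		fputs(line, stdout);	/* e.g. "Mitigation: usercopy/swapgs barriers ..." */
	if (f)
		fclose(f);
	return f ? 0 : 1;
}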


@@ -1,67 +0,0 @@
From: Todd Kjos <tkjos@android.com>
Date: Fri, 1 Mar 2019 15:06:06 -0800
Subject: binder: fix race between munmap() and direct reclaim
Origin: https://git.kernel.org/linus/5cec2d2e5839f9c0fec319c523a911e0a7fd299f
Bug-Debian-Security: https://security-tracker.debian.org/tracker/CVE-2019-1999
An munmap() on a binder device causes binder_vma_close() to be called
which clears the alloc->vma pointer.
If direct reclaim causes binder_alloc_free_page() to be called, there
is a race where alloc->vma is read into a local vma pointer and then
used later after the mm->mmap_sem is acquired. This can result in
calling zap_page_range() with an invalid vma which manifests as a
use-after-free in zap_page_range().
The fix is to check alloc->vma after acquiring the mmap_sem (which we
were acquiring anyway) and skip zap_page_range() if it has changed
to NULL.
Signed-off-by: Todd Kjos <tkjos@google.com>
Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/android/binder_alloc.c | 17 ++++++++---------
1 file changed, 8 insertions(+), 9 deletions(-)
diff --git a/drivers/android/binder_alloc.c b/drivers/android/binder_alloc.c
index 030c98f35cca..3863ef78e40f 100644
--- a/drivers/android/binder_alloc.c
+++ b/drivers/android/binder_alloc.c
@@ -958,14 +958,13 @@ enum lru_status binder_alloc_free_page(struct list_head *item,
index = page - alloc->pages;
page_addr = (uintptr_t)alloc->buffer + index * PAGE_SIZE;
+
+ mm = alloc->vma_vm_mm;
+ if (!mmget_not_zero(mm))
+ goto err_mmget;
+ if (!down_write_trylock(&mm->mmap_sem))
+ goto err_down_write_mmap_sem_failed;
vma = binder_alloc_get_vma(alloc);
- if (vma) {
- if (!mmget_not_zero(alloc->vma_vm_mm))
- goto err_mmget;
- mm = alloc->vma_vm_mm;
- if (!down_write_trylock(&mm->mmap_sem))
- goto err_down_write_mmap_sem_failed;
- }
list_lru_isolate(lru, item);
spin_unlock(lock);
@@ -979,9 +978,9 @@ enum lru_status binder_alloc_free_page(struct list_head *item,
trace_binder_unmap_user_end(alloc, index);
- up_write(&mm->mmap_sem);
- mmput(mm);
}
+ up_write(&mm->mmap_sem);
+ mmput(mm);
trace_binder_unmap_kernel_start(alloc, index);
--
2.20.1
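
The shape of the fix, as a self-contained pthreads sketch (not binder
code): re-read the shared pointer only after the lock is held, rather
than trusting a copy taken beforehand.

#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static int *shared;			/* cleared by the "munmap" side */

static void reclaim(void)
{
	pthread_mutex_lock(&lock);	/* the equivalent of mmap_sem */
	int *p = shared;		/* re-read under the lock */
	if (p)				/* may have gone NULL meanwhile */
		printf("reclaiming %d\n", *p);
	pthread_mutex_unlock(&lock);
}

static void unmap(void)
{
	pthread_mutex_lock(&lock);
	shared = NULL;			/* analogous to binder_vma_close() */
	pthread_mutex_unlock(&lock);
}

int main(void)
{
	int v = 42;

	shared = &v;
	reclaim();			/* finds the mapping, proceeds */
	unmap();
	reclaim();			/* safely skips: no use-after-free */
	return 0;
}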


@@ -1,106 +0,0 @@
From: Arend van Spriel <arend.vanspriel@broadcom.com>
Date: Thu, 14 Feb 2019 13:43:48 +0100
Subject: brcmfmac: add subtype check for event handling in data path
Origin: https://git.kernel.org/linus/a4176ec356c73a46c07c181c6d04039fafa34a9f
Bug-Debian-Security: https://security-tracker.debian.org/tracker/CVE-2019-9503
For USB there is no separate channel for passing events from firmware
to the host driver, so they are passed over the data path. In order to
detect mock event messages, an additional check on the event subtype is
needed. This check is added conditionally using the unlikely() keyword.
Reviewed-by: Hante Meuleman <hante.meuleman@broadcom.com>
Reviewed-by: Pieter-Paul Giesberts <pieter-paul.giesberts@broadcom.com>
Reviewed-by: Franky Lin <franky.lin@broadcom.com>
Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
---
.../wireless/broadcom/brcm80211/brcmfmac/core.c | 5 +++--
.../wireless/broadcom/brcm80211/brcmfmac/fweh.h | 16 ++++++++++++----
.../broadcom/brcm80211/brcmfmac/msgbuf.c | 2 +-
3 files changed, 16 insertions(+), 7 deletions(-)
diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/core.c b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/core.c
index e772c0845638..a368ba6e7344 100644
--- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/core.c
+++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/core.c
@@ -519,7 +519,8 @@ void brcmf_rx_frame(struct device *dev, struct sk_buff *skb, bool handle_event)
} else {
/* Process special event packets */
if (handle_event)
- brcmf_fweh_process_skb(ifp->drvr, skb);
+ brcmf_fweh_process_skb(ifp->drvr, skb,
+ BCMILCP_SUBTYPE_VENDOR_LONG);
brcmf_netif_rx(ifp, skb);
}
@@ -536,7 +537,7 @@ void brcmf_rx_event(struct device *dev, struct sk_buff *skb)
if (brcmf_rx_hdrpull(drvr, skb, &ifp))
return;
- brcmf_fweh_process_skb(ifp->drvr, skb);
+ brcmf_fweh_process_skb(ifp->drvr, skb, 0);
brcmu_pkt_buf_free_skb(skb);
}
diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/fweh.h b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/fweh.h
index 31f3e8e83a21..7027243db17e 100644
--- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/fweh.h
+++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/fweh.h
@@ -211,7 +211,7 @@ enum brcmf_fweh_event_code {
*/
#define BRCM_OUI "\x00\x10\x18"
#define BCMILCP_BCM_SUBTYPE_EVENT 1
-
+#define BCMILCP_SUBTYPE_VENDOR_LONG 32769
/**
* struct brcm_ethhdr - broadcom specific ether header.
@@ -334,10 +334,10 @@ void brcmf_fweh_process_event(struct brcmf_pub *drvr,
void brcmf_fweh_p2pdev_setup(struct brcmf_if *ifp, bool ongoing);
static inline void brcmf_fweh_process_skb(struct brcmf_pub *drvr,
- struct sk_buff *skb)
+ struct sk_buff *skb, u16 stype)
{
struct brcmf_event *event_packet;
- u16 usr_stype;
+ u16 subtype, usr_stype;
/* only process events when protocol matches */
if (skb->protocol != cpu_to_be16(ETH_P_LINK_CTL))
@@ -346,8 +346,16 @@ static inline void brcmf_fweh_process_skb(struct brcmf_pub *drvr,
if ((skb->len + ETH_HLEN) < sizeof(*event_packet))
return;
- /* check for BRCM oui match */
event_packet = (struct brcmf_event *)skb_mac_header(skb);
+
+ /* check subtype if needed */
+ if (unlikely(stype)) {
+ subtype = get_unaligned_be16(&event_packet->hdr.subtype);
+ if (subtype != stype)
+ return;
+ }
+
+ /* check for BRCM oui match */
if (memcmp(BRCM_OUI, &event_packet->hdr.oui[0],
sizeof(event_packet->hdr.oui)))
return;
diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/msgbuf.c b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/msgbuf.c
index 4e8397a0cbc8..ee922b052561 100644
--- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/msgbuf.c
+++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/msgbuf.c
@@ -1116,7 +1116,7 @@ static void brcmf_msgbuf_process_event(struct brcmf_msgbuf *msgbuf, void *buf)
skb->protocol = eth_type_trans(skb, ifp->ndev);
- brcmf_fweh_process_skb(ifp->drvr, skb);
+ brcmf_fweh_process_skb(ifp->drvr, skb, 0);
exit:
brcmu_pkt_buf_free_skb(skb);
--
2.20.1
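
The added check in miniature, as a self-contained sketch (plain C,
hypothetical header layout): when a subtype is requested, reject frames
whose big-endian subtype field does not match before any further parsing.

#include <stdint.h>
#include <stdbool.h>
#include <stdio.h>

#define SUBTYPE_VENDOR_LONG 32769u	/* BCMILCP_SUBTYPE_VENDOR_LONG */

static uint16_t get_be16(const uint8_t *p)	/* unaligned-safe load */
{
	return (uint16_t)((p[0] << 8) | p[1]);
}

static bool event_ok(const uint8_t *subtype_field, uint16_t want_stype)
{
	if (want_stype && get_be16(subtype_field) != want_stype)
		return false;		/* mock event arriving on the data path */
	return true;			/* carry on with the OUI check etc. */
}

int main(void)
{
	uint8_t good[2] = { 0x80, 0x01 };	/* 32769, big-endian */
	uint8_t bad[2]  = { 0x00, 0x01 };

	printf("%d %d\n", event_ok(good, SUBTYPE_VENDOR_LONG),
	       event_ok(bad, SUBTYPE_VENDOR_LONG));	/* 1 0 */
	return 0;
}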


@@ -1,34 +0,0 @@
From: Arend van Spriel <arend.vanspriel@broadcom.com>
Date: Thu, 14 Feb 2019 13:43:47 +0100
Subject: brcmfmac: assure SSID length from firmware is limited
Origin: https://git.kernel.org/linus/1b5e2423164b3670e8bc9174e4762d297990deff
Bug-Debian-Security: https://security-tracker.debian.org/tracker/CVE-2019-9500
The SSID length as received from firmware must not exceed
IEEE80211_MAX_SSID_LEN, as that would result in a heap overflow.
Reviewed-by: Hante Meuleman <hante.meuleman@broadcom.com>
Reviewed-by: Pieter-Paul Giesberts <pieter-paul.giesberts@broadcom.com>
Reviewed-by: Franky Lin <franky.lin@broadcom.com>
Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
---
drivers/net/wireless/broadcom/brcm80211/brcmfmac/cfg80211.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/cfg80211.c b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/cfg80211.c
index b5e291ed9496..012275fc3bf7 100644
--- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/cfg80211.c
+++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/cfg80211.c
@@ -3507,6 +3507,8 @@ brcmf_wowl_nd_results(struct brcmf_if *ifp, const struct brcmf_event_msg *e,
}
netinfo = brcmf_get_netinfo_array(pfn_result);
+ if (netinfo->SSID_len > IEEE80211_MAX_SSID_LEN)
+ netinfo->SSID_len = IEEE80211_MAX_SSID_LEN;
memcpy(cfg->wowl.nd->ssid.ssid, netinfo->SSID, netinfo->SSID_len);
cfg->wowl.nd->ssid.ssid_len = netinfo->SSID_len;
cfg->wowl.nd->n_channels = 1;
--
2.20.1

View File

@ -1,82 +0,0 @@
From: Sriram Rajagopalan <sriramr@arista.com>
Date: Fri, 10 May 2019 19:28:06 -0400
Subject: ext4: zero out the unused memory region in the extent tree block
Origin: https://git.kernel.org/linus/592acbf16821288ecdc4192c47e3774a4c48bb64
Bug-Debian-Security: https://security-tracker.debian.org/tracker/CVE-2019-11833
This commit zeroes out the unused memory region in the buffer_head
corresponding to the extent metablock after writing the extent header
and the corresponding extent node entries.
This is done to prevent random uninitialized data from getting into
the filesystem when the extent block is synced.
This fixes CVE-2019-11833.
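A minimal user-space sketch of the zeroing arithmetic; the 12-byte header
and extent sizes follow the on-disk ext4 layout, while the 4 KiB block size
is just an example value:

#include <stdint.h>
#include <stdio.h>
#include <string.h>

struct ext4_extent_header {            /* 12 bytes, as on disk */
	uint16_t eh_magic, eh_entries, eh_max, eh_depth;
	uint32_t eh_generation;
};
struct ext4_extent {                   /* 12 bytes, as on disk */
	uint32_t ee_block;
	uint16_t ee_len, ee_start_hi;
	uint32_t ee_start_lo;
};

int main(void)
{
	unsigned char bh[4096];            /* example block size */
	uint16_t entries = 5;
	size_t ext_size;

	memset(bh, 0xaa, sizeof(bh));      /* stand-in for stale memory */
	ext_size = sizeof(struct ext4_extent_header) +
		   sizeof(struct ext4_extent) * entries;
	memset(bh + ext_size, 0, sizeof(bh) - ext_size);   /* the fix */
	printf("zeroed %zu trailing bytes of %zu\n",
	       sizeof(bh) - ext_size, sizeof(bh));
	return 0;
}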
Signed-off-by: Sriram Rajagopalan <sriramr@arista.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
---
fs/ext4/extents.c | 17 +++++++++++++++--
1 file changed, 15 insertions(+), 2 deletions(-)
diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 0f89f5190cd7..f2c62e2a0c98 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -1035,6 +1035,7 @@ static int ext4_ext_split(handle_t *handle, struct inode *inode,
__le32 border;
ext4_fsblk_t *ablocks = NULL; /* array of allocated blocks */
int err = 0;
+ size_t ext_size = 0;
/* make decision: where to split? */
/* FIXME: now decision is simplest: at current extent */
@@ -1126,6 +1127,10 @@ static int ext4_ext_split(handle_t *handle, struct inode *inode,
le16_add_cpu(&neh->eh_entries, m);
}
+ /* zero out unused area in the extent block */
+ ext_size = sizeof(struct ext4_extent_header) +
+ sizeof(struct ext4_extent) * le16_to_cpu(neh->eh_entries);
+ memset(bh->b_data + ext_size, 0, inode->i_sb->s_blocksize - ext_size);
ext4_extent_block_csum_set(inode, neh);
set_buffer_uptodate(bh);
unlock_buffer(bh);
@@ -1205,6 +1210,11 @@ static int ext4_ext_split(handle_t *handle, struct inode *inode,
sizeof(struct ext4_extent_idx) * m);
le16_add_cpu(&neh->eh_entries, m);
}
+ /* zero out unused area in the extent block */
+ ext_size = sizeof(struct ext4_extent_header) +
+ (sizeof(struct ext4_extent) * le16_to_cpu(neh->eh_entries));
+ memset(bh->b_data + ext_size, 0,
+ inode->i_sb->s_blocksize - ext_size);
ext4_extent_block_csum_set(inode, neh);
set_buffer_uptodate(bh);
unlock_buffer(bh);
@@ -1270,6 +1280,7 @@ static int ext4_ext_grow_indepth(handle_t *handle, struct inode *inode,
ext4_fsblk_t newblock, goal = 0;
struct ext4_super_block *es = EXT4_SB(inode->i_sb)->s_es;
int err = 0;
+ size_t ext_size = 0;
/* Try to prepend new index to old one */
if (ext_depth(inode))
@@ -1295,9 +1306,11 @@ static int ext4_ext_grow_indepth(handle_t *handle, struct inode *inode,
goto out;
}
+ ext_size = sizeof(EXT4_I(inode)->i_data);
/* move top-level index/leaf into new block */
- memmove(bh->b_data, EXT4_I(inode)->i_data,
- sizeof(EXT4_I(inode)->i_data));
+ memmove(bh->b_data, EXT4_I(inode)->i_data, ext_size);
+ /* zero out unused area in the extent block */
+ memset(bh->b_data + ext_size, 0, inode->i_sb->s_blocksize - ext_size);
/* set size of new block */
neh = ext_block_hdr(bh);
--
2.20.1

View File

@ -25,7 +25,7 @@ upstream submission.
if (head->magic != 0x4e657458) {
--- a/arch/x86/kernel/cpu/microcode/amd.c
+++ b/arch/x86/kernel/cpu/microcode/amd.c
@@ -747,10 +747,8 @@ static enum ucode_state request_microcod
@@ -755,10 +755,8 @@ static enum ucode_state request_microcod
if (c->x86 >= 0x15)
snprintf(fw_name, sizeof(fw_name), "amd-ucode/microcode_amd_fam%.2xh.bin", c->x86);
@ -175,7 +175,7 @@ upstream submission.
fw->size, fw_name);
--- a/drivers/dma/imx-sdma.c
+++ b/drivers/dma/imx-sdma.c
@@ -1475,11 +1475,8 @@ static void sdma_load_firmware(const str
@@ -1674,11 +1674,8 @@ static void sdma_load_firmware(const str
const struct sdma_script_start_addrs *addr;
unsigned short *ram_code;
@ -285,7 +285,7 @@ upstream submission.
ret = qib_ibsd_ucode_loaded(dd->pport, fw);
--- a/drivers/input/touchscreen/atmel_mxt_ts.c
+++ b/drivers/input/touchscreen/atmel_mxt_ts.c
@@ -2760,10 +2760,8 @@ static int mxt_load_fw(struct device *de
@@ -2783,10 +2783,8 @@ static int mxt_load_fw(struct device *de
int ret;
ret = request_firmware(&fw, fn, dev);
@ -314,7 +314,7 @@ upstream submission.
card->name, firmware->size);
--- a/drivers/media/tuners/tuner-xc2028.c
+++ b/drivers/media/tuners/tuner-xc2028.c
@@ -1368,7 +1368,6 @@ static void load_firmware_cb(const struc
@@ -1367,7 +1367,6 @@ static void load_firmware_cb(const struc
tuner_dbg("request_firmware_nowait(): %s\n", fw ? "OK" : "error");
if (!fw) {
@ -324,7 +324,7 @@ upstream submission.
}
--- a/drivers/media/usb/dvb-usb/dib0700_devices.c
+++ b/drivers/media/usb/dvb-usb/dib0700_devices.c
@@ -2415,12 +2415,9 @@ static int stk9090m_frontend_attach(stru
@@ -2416,12 +2416,9 @@ static int stk9090m_frontend_attach(stru
dib9000_i2c_enumeration(&adap->dev->i2c_adap, 1, 0x10, 0x80);
@ -339,7 +339,7 @@ upstream submission.
stk9090m_config.microcode_B_fe_size = state->frontend_firmware->size;
stk9090m_config.microcode_B_fe_buffer = state->frontend_firmware->data;
@@ -2481,12 +2478,9 @@ static int nim9090md_frontend_attach(str
@@ -2482,12 +2479,9 @@ static int nim9090md_frontend_attach(str
msleep(20);
dib0700_set_gpio(adap->dev, GPIO0, GPIO_OUT, 1);
@ -472,7 +472,7 @@ upstream submission.
if (!state->microcode) {
--- a/drivers/media/dvb-frontends/drxk_hard.c
+++ b/drivers/media/dvb-frontends/drxk_hard.c
@@ -6287,10 +6287,6 @@ static void load_firmware_cb(const struc
@@ -6281,10 +6281,6 @@ static void load_firmware_cb(const struc
dprintk(1, ": %s\n", fw ? "firmware loaded" : "firmware not loaded");
if (!fw) {
@ -751,7 +751,7 @@ upstream submission.
packet_num = ptr[0];
--- a/drivers/media/radio/wl128x/fmdrv_common.c
+++ b/drivers/media/radio/wl128x/fmdrv_common.c
@@ -1242,10 +1242,8 @@ static int fm_download_firmware(struct f
@@ -1245,10 +1245,8 @@ static int fm_download_firmware(struct f
ret = request_firmware(&fw_entry, fw_name,
&fmdev->radio_dev->dev);
@ -944,7 +944,7 @@ upstream submission.
--- a/drivers/media/usb/pvrusb2/pvrusb2-hdw.c
+++ b/drivers/media/usb/pvrusb2/pvrusb2-hdw.c
@@ -1377,25 +1377,6 @@ static int pvr2_locate_firmware(struct p
@@ -1379,25 +1379,6 @@ static int pvr2_locate_firmware(struct p
"request_firmware fatal error with code=%d",ret);
return ret;
}
@ -1015,7 +1015,7 @@ upstream submission.
__func__, fw->size);
--- a/drivers/misc/ti-st/st_kim.c
+++ b/drivers/misc/ti-st/st_kim.c
@@ -302,11 +302,8 @@ static long download_firmware(struct kim
@@ -301,11 +301,8 @@ static long download_firmware(struct kim
request_firmware(&kim_gdata->fw_entry, bts_scr_name,
&kim_gdata->kim_pdev->dev);
if (unlikely((err != 0) || (kim_gdata->fw_entry->data == NULL) ||
@ -1088,7 +1088,7 @@ upstream submission.
fw_tx->size, FIRMWARE_TX);
--- a/drivers/net/ethernet/alteon/acenic.c
+++ b/drivers/net/ethernet/alteon/acenic.c
@@ -2892,11 +2892,8 @@ static int ace_load_firmware(struct net_
@@ -2890,11 +2890,8 @@ static int ace_load_firmware(struct net_
fw_name = "acenic/tg1.bin";
ret = request_firmware(&fw, fw_name, &ap->pdev->dev);
@ -1125,7 +1125,7 @@ upstream submission.
if (bp->mips_firmware->size < sizeof(*mips_fw) ||
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
@@ -13524,11 +13524,8 @@ static int bnx2x_init_firmware(struct bn
@@ -13550,11 +13550,8 @@ static int bnx2x_init_firmware(struct bn
BNX2X_DEV_INFO("Loading %s\n", fw_file_name);
rc = request_firmware(&bp->firmware, fw_file_name, &bp->pdev->dev);
@ -1140,7 +1140,7 @@ upstream submission.
if (rc) {
--- a/drivers/net/ethernet/broadcom/tg3.c
+++ b/drivers/net/ethernet/broadcom/tg3.c
@@ -11380,11 +11380,8 @@ static int tg3_request_firmware(struct t
@@ -11408,11 +11408,8 @@ static int tg3_request_firmware(struct t
{
const struct tg3_firmware_hdr *fw_hdr;
@ -1169,7 +1169,7 @@ upstream submission.
*bfi_image_size = fw->size/sizeof(u32);
--- a/drivers/net/ethernet/chelsio/cxgb3/cxgb3_main.c
+++ b/drivers/net/ethernet/chelsio/cxgb3/cxgb3_main.c
@@ -1037,12 +1037,8 @@ int t3_get_edc_fw(struct cphy *phy, int
@@ -1038,12 +1038,8 @@ int t3_get_edc_fw(struct cphy *phy, int
fw_name = get_edc_fw_name(edc_idx);
if (fw_name)
ret = request_firmware(&fw, fw_name, &adapter->pdev->dev);
@ -1183,7 +1183,7 @@ upstream submission.
/* check size, take checksum in account */
if (fw->size > size + 4) {
@@ -1079,11 +1075,8 @@ static int upgrade_fw(struct adapter *ad
@@ -1080,11 +1076,8 @@ static int upgrade_fw(struct adapter *ad
struct device *dev = &adap->pdev->dev;
ret = request_firmware(&fw, FW_FNAME, dev);
@ -1196,7 +1196,7 @@ upstream submission.
ret = t3_load_fw(adap, fw->data, fw->size);
release_firmware(fw);
@@ -1128,11 +1121,8 @@ static int update_tpsram(struct adapter
@@ -1129,11 +1122,8 @@ static int update_tpsram(struct adapter
snprintf(buf, sizeof(buf), TPSRAM_NAME, rev);
ret = request_firmware(&tpsram, buf, dev);
@ -1248,7 +1248,7 @@ upstream submission.
for (i = 0; i < fw->size; i++) {
--- a/drivers/net/ethernet/sun/cassini.c
+++ b/drivers/net/ethernet/sun/cassini.c
@@ -818,11 +818,8 @@ static void cas_saturn_firmware_init(str
@@ -805,11 +805,8 @@ static void cas_saturn_firmware_init(str
return;
err = request_firmware(&fw, fw_name, &cp->pdev->dev);
@ -1292,7 +1292,7 @@ upstream submission.
dev_err(&kaweth->intf->dev, "Firmware too big: %zu\n",
--- a/drivers/net/wimax/i2400m/fw.c
+++ b/drivers/net/wimax/i2400m/fw.c
@@ -1582,11 +1582,8 @@ int i2400m_dev_bootstrap(struct i2400m *
@@ -1581,11 +1581,8 @@ int i2400m_dev_bootstrap(struct i2400m *
}
d_printf(1, dev, "trying firmware %s (%d)\n", fw_name, itr);
ret = request_firmware(&fw, fw_name, dev);
@ -1305,7 +1305,7 @@ upstream submission.
i2400m->fw_name = fw_name;
ret = i2400m_fw_bootstrap(i2400m, fw, flags);
release_firmware(fw);
@@ -1629,8 +1626,6 @@ void i2400m_fw_cache(struct i2400m *i240
@@ -1628,8 +1625,6 @@ void i2400m_fw_cache(struct i2400m *i240
kref_init(&i2400m_fw->kref);
result = request_firmware(&i2400m_fw->fw, i2400m->fw_name, dev);
if (result < 0) {
@ -1333,7 +1333,7 @@ upstream submission.
fwh = (struct at76_fw_header *)(fwe->fw->data);
--- a/drivers/net/wireless/ath/ath9k/hif_usb.c
+++ b/drivers/net/wireless/ath/ath9k/hif_usb.c
@@ -1163,9 +1163,6 @@ static void ath9k_hif_usb_firmware_cb(co
@@ -1164,9 +1164,6 @@ static void ath9k_hif_usb_firmware_cb(co
if (!ret)
return;
@ -1345,7 +1345,7 @@ upstream submission.
--- a/drivers/net/wireless/ath/carl9170/usb.c
+++ b/drivers/net/wireless/ath/carl9170/usb.c
@@ -1031,7 +1031,6 @@ static void carl9170_usb_firmware_step2(
@@ -1029,7 +1029,6 @@ static void carl9170_usb_firmware_step2(
return;
}
@ -1355,7 +1355,7 @@ upstream submission.
--- a/drivers/net/wireless/atmel/atmel.c
+++ b/drivers/net/wireless/atmel/atmel.c
@@ -3897,12 +3897,8 @@ static int reset_atmel_card(struct net_d
@@ -3893,12 +3893,8 @@ static int reset_atmel_card(struct net_d
strcpy(priv->firmware_id, "atmel_at76c502.bin");
}
err = request_firmware(&fw_entry, priv->firmware_id, priv->sys_dev);
@ -1433,7 +1433,7 @@ upstream submission.
}
--- a/drivers/net/wireless/intel/ipw2x00/ipw2100.c
+++ b/drivers/net/wireless/intel/ipw2x00/ipw2100.c
@@ -8417,12 +8417,8 @@ static int ipw2100_get_firmware(struct i
@@ -8410,12 +8410,8 @@ static int ipw2100_get_firmware(struct i
rc = request_firmware(&fw->fw_entry, fw_name, &priv->pci_dev->dev);
@ -1463,7 +1463,7 @@ upstream submission.
IPW_ERROR("%s is too small (%zd)\n", name, (*raw)->size);
--- a/drivers/net/wireless/intel/iwlegacy/3945-mac.c
+++ b/drivers/net/wireless/intel/iwlegacy/3945-mac.c
@@ -1861,7 +1861,6 @@ il3945_read_ucode(struct il_priv *il)
@@ -1854,7 +1854,6 @@ il3945_read_ucode(struct il_priv *il)
sprintf(buf, "%s%u%s", name_pre, idx, ".ucode");
ret = request_firmware(&ucode_raw, buf, &il->pci_dev->dev);
if (ret < 0) {
@ -1484,7 +1484,7 @@ upstream submission.
cfg->ucode_api_max);
--- a/drivers/net/wireless/marvell/libertas_tf/if_usb.c
+++ b/drivers/net/wireless/marvell/libertas_tf/if_usb.c
@@ -818,8 +818,6 @@ static int if_usb_prog_firmware(struct i
@@ -817,8 +817,6 @@ static int if_usb_prog_firmware(struct i
kernel_param_lock(THIS_MODULE);
ret = request_firmware(&cardp->fw, lbtf_fw_name, &cardp->udev->dev);
if (ret < 0) {
@ -1495,7 +1495,7 @@ upstream submission.
}
--- a/drivers/net/wireless/marvell/mwifiex/main.c
+++ b/drivers/net/wireless/marvell/mwifiex/main.c
@@ -525,11 +525,8 @@ static int _mwifiex_fw_dpc(const struct
@@ -528,11 +528,8 @@ static int _mwifiex_fw_dpc(const struct
struct wireless_dev *wdev;
struct completion *fw_done = adapter->fw_done;
@ -1510,7 +1510,7 @@ upstream submission.
adapter->firmware = firmware;
--- a/drivers/net/wireless/marvell/mwl8k.c
+++ b/drivers/net/wireless/marvell/mwl8k.c
@@ -5719,16 +5719,12 @@ static int mwl8k_firmware_load_success(s
@@ -5724,16 +5724,12 @@ static int mwl8k_firmware_load_success(s
static void mwl8k_fw_state_machine(const struct firmware *fw, void *context)
{
struct mwl8k_priv *priv = context;
@ -1528,7 +1528,7 @@ upstream submission.
priv->fw_helper = fw;
rc = mwl8k_request_fw(priv, priv->fw_pref, &priv->fw_ucode,
true);
@@ -5763,11 +5759,8 @@ static void mwl8k_fw_state_machine(const
@@ -5768,11 +5764,8 @@ static void mwl8k_fw_state_machine(const
break;
case FW_STATE_LOADING_ALT:
@ -1541,7 +1541,7 @@ upstream submission.
priv->fw_ucode = fw;
rc = mwl8k_firmware_load_success(priv);
if (rc)
@@ -5805,10 +5798,8 @@ retry:
@@ -5810,10 +5803,8 @@ retry:
/* Ask userland hotplug daemon for the device firmware */
rc = mwl8k_request_firmware(priv, fw_image, nowait);
@ -1623,14 +1623,14 @@ upstream submission.
if (ret) {
--- a/drivers/net/wireless/intersil/p54/p54usb.c
+++ b/drivers/net/wireless/intersil/p54/p54usb.c
@@ -929,7 +929,6 @@ static void p54u_load_firmware_cb(const
@@ -931,7 +931,6 @@ static void p54u_load_firmware_cb(const
err = p54u_start_ops(priv);
} else {
err = -ENOENT;
- dev_err(&udev->dev, "Firmware not found.\n");
}
if (err) {
complete(&priv->fw_wait_load);
--- a/drivers/net/wireless/intersil/prism54/islpci_dev.c
+++ b/drivers/net/wireless/intersil/prism54/islpci_dev.c
@@ -92,12 +92,9 @@ isl_upload_firmware(islpci_private *priv
@ -1716,7 +1716,7 @@ upstream submission.
wl1251_error("nvs size is not multiple of 32 bits: %zu",
--- a/drivers/net/wireless/ti/wlcore/main.c
+++ b/drivers/net/wireless/ti/wlcore/main.c
@@ -755,10 +755,8 @@ static int wl12xx_fetch_firmware(struct
@@ -768,10 +768,8 @@ static int wl12xx_fetch_firmware(struct
ret = request_firmware(&fw, fw_name, wl->dev);
@ -1835,7 +1835,7 @@ upstream submission.
}
--- a/drivers/scsi/ipr.c
+++ b/drivers/scsi/ipr.c
@@ -4063,10 +4063,8 @@ static ssize_t ipr_store_update_fw(struc
@@ -4102,10 +4102,8 @@ static ssize_t ipr_store_update_fw(struc
if (endline)
*endline = '\0';
@ -1873,7 +1873,7 @@ upstream submission.
}
--- a/drivers/scsi/qla2xxx/qla_init.c
+++ b/drivers/scsi/qla2xxx/qla_init.c
@@ -7275,8 +7275,6 @@ qla2x00_load_risc(scsi_qla_host_t *vha,
@@ -7454,8 +7454,6 @@ qla2x00_load_risc(scsi_qla_host_t *vha,
/* Load firmware blob. */
blob = qla2x00_request_firmware(vha);
if (!blob) {
@ -1882,7 +1882,7 @@ upstream submission.
ql_log(ql_log_info, vha, 0x0084,
"Firmware images can be retrieved from: "QLA_FW_URL ".\n");
return QLA_FUNCTION_FAILED;
@@ -7378,8 +7376,6 @@ qla24xx_load_risc_blob(scsi_qla_host_t *
@@ -7557,8 +7555,6 @@ qla24xx_load_risc_blob(scsi_qla_host_t *
/* Load firmware blob. */
blob = qla2x00_request_firmware(vha);
if (!blob) {
@ -1908,7 +1908,7 @@ upstream submission.
if (qla82xx_validate_firmware_blob(vha,
--- a/drivers/scsi/qla2xxx/qla_os.c
+++ b/drivers/scsi/qla2xxx/qla_os.c
@@ -6517,8 +6517,6 @@ qla2x00_request_firmware(scsi_qla_host_t
@@ -6533,8 +6533,6 @@ qla2x00_request_firmware(scsi_qla_host_t
goto out;
if (request_firmware(&blob->fw, blob->name, &ha->pdev->dev)) {
@ -2297,7 +2297,7 @@ upstream submission.
--- a/drivers/usb/serial/ti_usb_3410_5052.c
+++ b/drivers/usb/serial/ti_usb_3410_5052.c
@@ -1692,10 +1692,8 @@ static int ti_download_firmware(struct t
@@ -1693,10 +1693,8 @@ static int ti_download_firmware(struct t
}
check_firmware:
@ -2454,7 +2454,7 @@ upstream submission.
snd_emu1010_fpga_read(emu, EMU_HANA_ID, &reg);
--- a/sound/pci/hda/hda_intel.c
+++ b/sound/pci/hda/hda_intel.c
@@ -1971,10 +1971,8 @@ static void azx_firmware_cb(const struct
@@ -2077,10 +2077,8 @@ static void azx_firmware_cb(const struct
struct azx *chip = card->private_data;
struct pci_dev *pci = chip->pci;
@ -2524,7 +2524,7 @@ upstream submission.
if (err) {
--- a/sound/pci/rme9652/hdsp.c
+++ b/sound/pci/rme9652/hdsp.c
@@ -5132,11 +5132,8 @@ static int hdsp_request_fw_loader(struct
@@ -5134,11 +5134,8 @@ static int hdsp_request_fw_loader(struct
return -EINVAL;
}

View File

@ -1,64 +0,0 @@
From: Denis Efremov <efremov@ispras.ru>
Date: Fri, 12 Jul 2019 21:55:20 +0300
Subject: floppy: fix div-by-zero in setup_format_params
Origin: https://git.kernel.org/linus/f3554aeb991214cbfafd17d55e2bfddb50282e32
Bug-Debian-Security: https://security-tracker.debian.org/tracker/CVE-2019-14284
[ Upstream commit f3554aeb991214cbfafd17d55e2bfddb50282e32 ]
This fixes a divide by zero error in the setup_format_params function of
the floppy driver.
Two consecutive ioctls can trigger the bug: The first one should set the
drive geometry with .sect and .rate values such that F_SECT_PER_TRACK
becomes zero. Next, the floppy format operation should be called.
A floppy disk is not required to be inserted. An unprivileged user
could trigger the bug if the device is accessible.
The patch checks F_SECT_PER_TRACK for a non-zero value in the
set_geometry function. The proper check should involve a reasonable
upper limit for the .sect and .rate fields, but it could change the
UAPI.
The patch also checks F_SECT_PER_TRACK in the setup_format_params, and
cancels the formatting operation in case of zero.
The bug was found by syzkaller.
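A user-space sketch of how the geometry drives F_SECT_PER_TRACK to zero;
the two macros are transcribed in spirit from drivers/block/floppy.c, so
treat the exact definitions as an assumption of the sketch:

#include <stdio.h>

#define FD_SIZECODE(rate)  (((((rate) & 0x38) >> 3) + 2) % 8)
#define SECT_PER_TRACK(sect, rate) \
	((unsigned char)(((sect) << 2) >> FD_SIZECODE(rate)))

int main(void)
{
	unsigned int sect = 64, rate = 0x30;   /* FD_SIZECODE(0x30) == 0 */

	/* 64 << 2 == 256, 256 >> 0 == 256, (unsigned char)256 == 0 */
	printf("F_SECT_PER_TRACK = %u\n", SECT_PER_TRACK(sect, rate));
	return 0;
}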
Signed-off-by: Denis Efremov <efremov@ispras.ru>
Tested-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/block/floppy.c | 5 +++++
1 file changed, 5 insertions(+)
(limited to 'drivers/block/floppy.c')
diff --git a/drivers/block/floppy.c b/drivers/block/floppy.c
index a8de56f1936d..b1425b218606 100644
--- a/drivers/block/floppy.c
+++ b/drivers/block/floppy.c
@@ -2119,6 +2119,9 @@ static void setup_format_params(int track)
raw_cmd->kernel_data = floppy_track_buffer;
raw_cmd->length = 4 * F_SECT_PER_TRACK;
+ if (!F_SECT_PER_TRACK)
+ return;
+
/* allow for about 30ms for data transport per track */
head_shift = (F_SECT_PER_TRACK + 5) / 6;
@@ -3243,6 +3246,8 @@ static int set_geometry(unsigned int cmd, struct floppy_struct *g,
/* sanity checking for parameters. */
if (g->sect <= 0 ||
g->head <= 0 ||
+ /* check for zero in F_SECT_PER_TRACK */
+ (unsigned char)((g->sect << 2) >> FD_SIZECODE(g)) == 0 ||
g->track <= 0 || g->track > UDP->tracks >> STRETCH(g) ||
/* check if reserved bits are set */
(g->stretch & ~(FD_STRETCH | FD_SWAPSIDES | FD_SECTBASEMASK)) != 0)
--
cgit 1.2-0.3.lf.el7

View File

@ -1,55 +0,0 @@
From: Denis Efremov <efremov@ispras.ru>
Date: Fri, 12 Jul 2019 21:55:23 +0300
Subject: floppy: fix out-of-bounds read in copy_buffer
Origin: https://git.kernel.org/linus/da99466ac243f15fbba65bd261bfc75ffa1532b6
Bug-Debian-Security: https://security-tracker.debian.org/tracker/CVE-2019-14283
[ Upstream commit da99466ac243f15fbba65bd261bfc75ffa1532b6 ]
This fixes a global out-of-bounds read access in the copy_buffer
function of the floppy driver.
The FDDEFPRM ioctl allows one to set the geometry of a disk. The sect
and head fields (unsigned int) of the floppy_struct structure are used to
compute the max_sector (int) in the make_raw_rw_request function. It is
possible to overflow the max_sector. Next, max_sector is passed to the
copy_buffer function and used in one of the memcpy calls.
An unprivileged user could trigger the bug if the device is accessible,
but requires a floppy disk to be inserted.
The patch adds the check for the .sect * .head multiplication for not
overflowing in the set_geometry function.
The bug was found by syzkaller.
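The overflow is plain 32-bit arithmetic and easy to reproduce in user
space; the values below pass the old per-field checks but wrap the product
negative on common ABIs:

#include <stdio.h>

int main(void)
{
	unsigned int sect = 65536, head = 32768;   /* both individually sane */
	int max_sector = (int)(sect * head);       /* 2^31 wraps to INT_MIN */

	printf("max_sector = %d\n", max_sector);           /* -2147483648 */
	printf("rejected by new check: %d\n",
	       (int)(sect * head) <= 0);                   /* 1 */
	return 0;
}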
Signed-off-by: Denis Efremov <efremov@ispras.ru>
Tested-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/block/floppy.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
(limited to 'drivers/block/floppy.c')
diff --git a/drivers/block/floppy.c b/drivers/block/floppy.c
index 8d69a8af8b78..4a9a4d12721a 100644
--- a/drivers/block/floppy.c
+++ b/drivers/block/floppy.c
@@ -3244,8 +3244,10 @@ static int set_geometry(unsigned int cmd, struct floppy_struct *g,
int cnt;
/* sanity checking for parameters. */
- if (g->sect <= 0 ||
- g->head <= 0 ||
+ if ((int)g->sect <= 0 ||
+ (int)g->head <= 0 ||
+ /* check for overflow in max_sector */
+ (int)(g->sect * g->head) <= 0 ||
/* check for zero in F_SECT_PER_TRACK */
(unsigned char)((g->sect << 2) >> FD_SIZECODE(g)) == 0 ||
g->track <= 0 || g->track > UDP->tracks >> STRETCH(g) ||
--
cgit 1.2-0.3.lf.el7

View File

@ -1,82 +0,0 @@
From: Grant Hernandez <granthernandez@google.com>
Date: Sat, 13 Jul 2019 01:00:12 -0700
Subject: Input: gtco - bounds check collection indent level
Origin: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=d657077eda7b5572d86f2f618391bb016b5d9a64
Bug-Debian-Security: https://security-tracker.debian.org/tracker/CVE-2019-13631
commit 2a017fd82c5402b3c8df5e3d6e5165d9e6147dc1 upstream.
The GTCO tablet input driver configures itself from an HID report sent
via USB during the initial enumeration process. Some debugging messages
are generated during the parsing. A debugging message indentation
counter is not bounds checked, leading to the ability for a specially
crafted HID report to cause '-' and null bytes to be written past the end
of the indentation array. As long as the kernel has CONFIG_DYNAMIC_DEBUG
enabled, this code will not be optimized out. This was discovered
during code review after a previous syzkaller bug was found in this
driver.
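A simplified user-space model of the bounded indent handling; the tag loop
and the 50-entry hostile report are invented for illustration:

#include <stdio.h>

#define MAX_COLLECTION_LEVELS 10

int main(void)
{
	char indentstr[MAX_COLLECTION_LEVELS + 1] = { 0 };
	int indent = 0, x;

	for (int tag = 0; tag < 50; tag++) {   /* 50 COL_START tags */
		if (indent == MAX_COLLECTION_LEVELS) {
			fprintf(stderr, "level %d would exceed limit of %d\n",
				indent + 1, MAX_COLLECTION_LEVELS);
			break;     /* old code kept writing past the array */
		}
		indent++;
		for (x = 0; x < indent; x++)
			indentstr[x] = '-';
		indentstr[x] = 0;
	}
	printf("depth capped at %d: \"%s\"\n", indent, indentstr);
	return 0;
}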
Signed-off-by: Grant Hernandez <granthernandez@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/input/tablet/gtco.c | 20 +++++++++++++++++---
1 file changed, 17 insertions(+), 3 deletions(-)
diff --git a/drivers/input/tablet/gtco.c b/drivers/input/tablet/gtco.c
index 4b8b9d7aa75e..35031228a6d0 100644
--- a/drivers/input/tablet/gtco.c
+++ b/drivers/input/tablet/gtco.c
@@ -78,6 +78,7 @@ Scott Hill shill@gtcocalcomp.com
/* Max size of a single report */
#define REPORT_MAX_SIZE 10
+#define MAX_COLLECTION_LEVELS 10
/* Bitmask whether pen is in range */
@@ -223,8 +224,7 @@ static void parse_hid_report_descriptor(struct gtco *device, char * report,
char maintype = 'x';
char globtype[12];
int indent = 0;
- char indentstr[10] = "";
-
+ char indentstr[MAX_COLLECTION_LEVELS + 1] = { 0 };
dev_dbg(ddev, "======>>>>>>PARSE<<<<<<======\n");
@@ -350,6 +350,13 @@ static void parse_hid_report_descriptor(struct gtco *device, char * report,
case TAG_MAIN_COL_START:
maintype = 'S';
+ if (indent == MAX_COLLECTION_LEVELS) {
+ dev_err(ddev, "Collection level %d would exceed limit of %d\n",
+ indent + 1,
+ MAX_COLLECTION_LEVELS);
+ break;
+ }
+
if (data == 0) {
dev_dbg(ddev, "======>>>>>> Physical\n");
strcpy(globtype, "Physical");
@@ -369,8 +376,15 @@ static void parse_hid_report_descriptor(struct gtco *device, char * report,
break;
case TAG_MAIN_COL_END:
- dev_dbg(ddev, "<<<<<<======\n");
maintype = 'E';
+
+ if (indent == 0) {
+ dev_err(ddev, "Collection level already at zero\n");
+ break;
+ }
+
+ dev_dbg(ddev, "<<<<<<======\n");
+
indent--;
for (x = 0; x < indent; x++)
indentstr[x] = '-';
--
cgit 1.2-0.3.lf.el7

View File

@ -1,95 +0,0 @@
From: Jiri Kosina <jkosina@suse.cz>
Date: Tue, 14 May 2019 15:41:38 -0700
Subject: mm/mincore.c: make mincore() more conservative
Origin: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit?id=f580a54bbd522f2518fd642f7d4d73ad728e5d58
Bug-Debian-Security: https://security-tracker.debian.org/tracker/CVE-2019-5489
commit 134fca9063ad4851de767d1768180e5dede9a881 upstream.
The semantics of what mincore() considers to be resident are not
completely clear, but Linux has always (since 2.3.52, which is when
mincore() was initially done) treated it as "page is available in page
cache".
That's potentially a problem, as that [in]directly exposes
meta-information about pagecache / memory mapping state even about
memory not strictly belonging to the process executing the syscall,
opening possibilities for sidechannel attacks.
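For context, this is roughly what the observation looks like from user
space; the probe is just a legitimate mincore(2) call on a shared file
mapping, not an attack in itself:

#define _DEFAULT_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main(int argc, char **argv)
{
	if (argc < 2)
		return 1;
	int fd = open(argv[1], O_RDONLY);
	if (fd < 0)
		return 1;
	void *map = mmap(NULL, 4096, PROT_READ, MAP_SHARED, fd, 0);
	if (map == MAP_FAILED)
		return 1;
	unsigned char vec;
	if (mincore(map, 4096, &vec) == 0)
		printf("first page resident in page cache: %d\n", vec & 1);
	/* After this change the kernel reports the page as resident unless
	 * the caller could open the file for writing, so the observation
	 * above no longer leaks other processes' access patterns. */
	return 0;
}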
Change the semantics of mincore() so that it only reveals pagecache
information for non-anonymous mappings that belong to files that the
calling process could (if it tried to) successfully open for writing;
otherwise we'd be including shared non-exclusive mappings, which
- is the sidechannel
- is not the usecase for mincore(), as that's primarily used for data,
not (shared) text
[jkosina@suse.cz: v2]
Link: http://lkml.kernel.org/r/20190312141708.6652-2-vbabka@suse.cz
[mhocko@suse.com: restructure can_do_mincore() conditions]
Link: http://lkml.kernel.org/r/nycvar.YFH.7.76.1903062342020.19912@cbobk.fhfr.pm
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Josh Snyder <joshs@netflix.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Originally-by: Linus Torvalds <torvalds@linux-foundation.org>
Originally-by: Dominique Martinet <asmadeus@codewreck.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Kevin Easton <kevin@guarana.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Cyril Hrubis <chrubis@suse.cz>
Cc: Tejun Heo <tj@kernel.org>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Daniel Gruss <daniel@gruss.cc>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
mm/mincore.c | 23 ++++++++++++++++++++++-
1 file changed, 22 insertions(+), 1 deletion(-)
diff --git a/mm/mincore.c b/mm/mincore.c
index fc37afe226e6..2732c8c0764c 100644
--- a/mm/mincore.c
+++ b/mm/mincore.c
@@ -169,6 +169,22 @@ static int mincore_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end,
return 0;
}
+static inline bool can_do_mincore(struct vm_area_struct *vma)
+{
+ if (vma_is_anonymous(vma))
+ return true;
+ if (!vma->vm_file)
+ return false;
+ /*
+ * Reveal pagecache information only for non-anonymous mappings that
+ * correspond to the files the calling process could (if tried) open
+ * for writing; otherwise we'd be including shared non-exclusive
+ * mappings, which opens a side channel.
+ */
+ return inode_owner_or_capable(file_inode(vma->vm_file)) ||
+ inode_permission(file_inode(vma->vm_file), MAY_WRITE) == 0;
+}
+
/*
* Do a chunk of "sys_mincore()". We've already checked
* all the arguments, we hold the mmap semaphore: we should
@@ -189,8 +205,13 @@ static long do_mincore(unsigned long addr, unsigned long pages, unsigned char *v
vma = find_vma(current->mm, addr);
if (!vma || addr < vma->vm_start)
return -ENOMEM;
- mincore_walk.mm = vma->vm_mm;
end = min(vma->vm_end, addr + (pages << PAGE_SHIFT));
+ if (!can_do_mincore(vma)) {
+ unsigned long pages = DIV_ROUND_UP(end - addr, PAGE_SIZE);
+ memset(vec, 1, pages);
+ return pages;
+ }
+ mincore_walk.mm = vma->vm_mm;
err = walk_page_range(addr, end, &mincore_walk);
if (err < 0)
return err;

View File

@ -1,83 +0,0 @@
From: Takashi Iwai <tiwai@suse.de>
Date: Wed, 29 May 2019 14:52:20 +0200
Subject: mwifiex: Abort at too short BSS descriptor element
Origin: https://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers.git/commit?id=685c9b7750bfacd6fc1db50d86579980593b7869
Currently mwifiex_update_bss_desc_with_ie() implicitly assumes that
the source descriptor entries are large enough for each element type
and performs the copy without checking the source size. This may lead
to a read beyond the buffer boundary.
Fix this by putting the source size check in appropriate places.
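The shape of each added check, modelled in user space; the 3-byte DS
Params layout is real 802.11, but the helper function itself is invented
for the sketch:

#include <stdint.h>
#include <stdio.h>
#include <string.h>

struct ds_param_set {              /* element id, length, current channel */
	uint8_t element_id, len, current_chan;
};

static int parse_ds(const uint8_t *current_ptr, size_t element_len,
		    struct ds_param_set *out)
{
	/* element_len counts only the payload; +2 adds the id/len bytes */
	if (element_len + 2 < sizeof(*out))
		return -1;                     /* too short: refuse to read */
	memcpy(out, current_ptr, sizeof(*out));
	return 0;
}

int main(void)
{
	uint8_t ie[] = { 3, 1, 11 };           /* DS Params, len 1, chan 11 */
	struct ds_param_set ds;
	int ok = parse_ds(ie, ie[1], &ds) == 0;

	printf("parsed ok: %d, channel %u\n", ok, (unsigned)ds.current_chan);
	return 0;
}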
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
---
drivers/net/wireless/marvell/mwifiex/scan.c | 15 +++++++++++++++
1 file changed, 15 insertions(+)
diff --git a/drivers/net/wireless/marvell/mwifiex/scan.c b/drivers/net/wireless/marvell/mwifiex/scan.c
index 64ab6fe78c0d..c269a0de9413 100644
--- a/drivers/net/wireless/marvell/mwifiex/scan.c
+++ b/drivers/net/wireless/marvell/mwifiex/scan.c
@@ -1269,6 +1269,8 @@ int mwifiex_update_bss_desc_with_ie(struct mwifiex_adapter *adapter,
break;
case WLAN_EID_FH_PARAMS:
+ if (element_len + 2 < sizeof(*fh_param_set))
+ return -EINVAL;
fh_param_set =
(struct ieee_types_fh_param_set *) current_ptr;
memcpy(&bss_entry->phy_param_set.fh_param_set,
@@ -1277,6 +1279,8 @@ int mwifiex_update_bss_desc_with_ie(struct mwifiex_adapter *adapter,
break;
case WLAN_EID_DS_PARAMS:
+ if (element_len + 2 < sizeof(*ds_param_set))
+ return -EINVAL;
ds_param_set =
(struct ieee_types_ds_param_set *) current_ptr;
@@ -1288,6 +1292,8 @@ int mwifiex_update_bss_desc_with_ie(struct mwifiex_adapter *adapter,
break;
case WLAN_EID_CF_PARAMS:
+ if (element_len + 2 < sizeof(*cf_param_set))
+ return -EINVAL;
cf_param_set =
(struct ieee_types_cf_param_set *) current_ptr;
memcpy(&bss_entry->ss_param_set.cf_param_set,
@@ -1296,6 +1302,8 @@ int mwifiex_update_bss_desc_with_ie(struct mwifiex_adapter *adapter,
break;
case WLAN_EID_IBSS_PARAMS:
+ if (element_len + 2 < sizeof(*ibss_param_set))
+ return -EINVAL;
ibss_param_set =
(struct ieee_types_ibss_param_set *)
current_ptr;
@@ -1305,10 +1313,14 @@ int mwifiex_update_bss_desc_with_ie(struct mwifiex_adapter *adapter,
break;
case WLAN_EID_ERP_INFO:
+ if (!element_len)
+ return -EINVAL;
bss_entry->erp_flags = *(current_ptr + 2);
break;
case WLAN_EID_PWR_CONSTRAINT:
+ if (!element_len)
+ return -EINVAL;
bss_entry->local_constraint = *(current_ptr + 2);
bss_entry->sensed_11h = true;
break;
@@ -1349,6 +1361,9 @@ int mwifiex_update_bss_desc_with_ie(struct mwifiex_adapter *adapter,
break;
case WLAN_EID_VENDOR_SPECIFIC:
+ if (element_len + 2 < sizeof(vendor_ie->vend_hdr))
+ return -EINVAL;
+
vendor_ie = (struct ieee_types_vendor_specific *)
current_ptr;

View File

@ -1,135 +0,0 @@
From: Brian Norris <briannorris@chromium.org>
Subject: [PATCH 5.2 1/2] mwifiex: Don't abort on small,
spec-compliant vendor IEs
Date: Fri, 14 Jun 2019 17:13:20 -0700
Origin: https://patchwork.kernel.org/patch/10996895/
Per the 802.11 specification, vendor IEs are (at minimum) only required
to contain an OUI. A type field is also included in ieee80211.h (struct
ieee80211_vendor_ie) but doesn't appear in the specification. The
remaining fields (subtype, version) are a convention used in WMM
headers.
Thus, we should not reject vendor-specific IEs that have only the
minimum length (3 bytes) -- we should skip over them (since we only want
to match longer IEs that match either WMM or WPA formats). We can
reject elements that don't have the minimum-required 3 byte OUI.
While we're at it, move the non-standard subtype and version fields into
the WMM structs, to avoid this confusion in the future about generic
"vendor header" attributes.
Fixes: 685c9b7750bf ("mwifiex: Abort at too short BSS descriptor element")
Cc: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Brian Norris <briannorris@chromium.org>
---
It appears that commit 685c9b7750bf is on its way to 5.2, so I labeled
this bugfix for 5.2 as well.
drivers/net/wireless/marvell/mwifiex/fw.h | 12 +++++++++---
drivers/net/wireless/marvell/mwifiex/scan.c | 18 +++++++++++-------
.../net/wireless/marvell/mwifiex/sta_ioctl.c | 4 ++--
drivers/net/wireless/marvell/mwifiex/wmm.c | 2 +-
4 files changed, 23 insertions(+), 13 deletions(-)
--- a/drivers/net/wireless/marvell/mwifiex/fw.h
+++ b/drivers/net/wireless/marvell/mwifiex/fw.h
@@ -1759,9 +1759,10 @@ struct mwifiex_ie_types_wmm_queue_status
struct ieee_types_vendor_header {
u8 element_id;
u8 len;
- u8 oui[4]; /* 0~2: oui, 3: oui_type */
- u8 oui_subtype;
- u8 version;
+ struct {
+ u8 oui[3];
+ u8 oui_type;
+ } __packed oui;
} __packed;
struct ieee_types_wmm_parameter {
@@ -1775,6 +1776,9 @@ struct ieee_types_wmm_parameter {
* Version [1]
*/
struct ieee_types_vendor_header vend_hdr;
+ u8 oui_subtype;
+ u8 version;
+
u8 qos_info_bitmap;
u8 reserved;
struct ieee_types_wmm_ac_parameters ac_params[IEEE80211_NUM_ACS];
@@ -1792,6 +1796,8 @@ struct ieee_types_wmm_info {
* Version [1]
*/
struct ieee_types_vendor_header vend_hdr;
+ u8 oui_subtype;
+ u8 version;
u8 qos_info_bitmap;
} __packed;
--- a/drivers/net/wireless/marvell/mwifiex/scan.c
+++ b/drivers/net/wireless/marvell/mwifiex/scan.c
@@ -1361,21 +1361,25 @@ int mwifiex_update_bss_desc_with_ie(stru
break;
case WLAN_EID_VENDOR_SPECIFIC:
- if (element_len + 2 < sizeof(vendor_ie->vend_hdr))
- return -EINVAL;
-
vendor_ie = (struct ieee_types_vendor_specific *)
current_ptr;
- if (!memcmp
- (vendor_ie->vend_hdr.oui, wpa_oui,
- sizeof(wpa_oui))) {
+ /* 802.11 requires at least 3-byte OUI. */
+ if (element_len < sizeof(vendor_ie->vend_hdr.oui.oui))
+ return -EINVAL;
+
+ /* Not long enough for a match? Skip it. */
+ if (element_len < sizeof(wpa_oui))
+ break;
+
+ if (!memcmp(&vendor_ie->vend_hdr.oui, wpa_oui,
+ sizeof(wpa_oui))) {
bss_entry->bcn_wpa_ie =
(struct ieee_types_vendor_specific *)
current_ptr;
bss_entry->wpa_offset = (u16)
(current_ptr - bss_entry->beacon_buf);
- } else if (!memcmp(vendor_ie->vend_hdr.oui, wmm_oui,
+ } else if (!memcmp(&vendor_ie->vend_hdr.oui, wmm_oui,
sizeof(wmm_oui))) {
if (total_ie_len ==
sizeof(struct ieee_types_wmm_parameter) ||
--- a/drivers/net/wireless/marvell/mwifiex/sta_ioctl.c
+++ b/drivers/net/wireless/marvell/mwifiex/sta_ioctl.c
@@ -1348,7 +1348,7 @@ mwifiex_set_gen_ie_helper(struct mwifiex
/* Test to see if it is a WPA IE, if not, then
* it is a gen IE
*/
- if (!memcmp(pvendor_ie->oui, wpa_oui,
+ if (!memcmp(&pvendor_ie->oui, wpa_oui,
sizeof(wpa_oui))) {
/* IE is a WPA/WPA2 IE so call set_wpa function
*/
@@ -1358,7 +1358,7 @@ mwifiex_set_gen_ie_helper(struct mwifiex
goto next_ie;
}
- if (!memcmp(pvendor_ie->oui, wps_oui,
+ if (!memcmp(&pvendor_ie->oui, wps_oui,
sizeof(wps_oui))) {
/* Test to see if it is a WPS IE,
* if so, enable wps session flag
--- a/drivers/net/wireless/marvell/mwifiex/wmm.c
+++ b/drivers/net/wireless/marvell/mwifiex/wmm.c
@@ -240,7 +240,7 @@ mwifiex_wmm_setup_queue_priorities(struc
mwifiex_dbg(priv->adapter, INFO,
"info: WMM Parameter IE: version=%d,\t"
"qos_info Parameter Set Count=%d, Reserved=%#x\n",
- wmm_ie->vend_hdr.version, wmm_ie->qos_info_bitmap &
+ wmm_ie->version, wmm_ie->qos_info_bitmap &
IEEE80211_WMM_IE_AP_QOSINFO_PARAM_SET_CNT_MASK,
wmm_ie->reserved);

View File

@ -1,118 +0,0 @@
From: Takashi Iwai <tiwai@suse.de>
Date: Fri, 31 May 2019 15:18:41 +0200
Subject: mwifiex: Fix heap overflow in mwifiex_uap_parse_tail_ies()
Origin: https://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers.git/commit?id=69ae4f6aac1578575126319d3f55550e7e440449
Bug-Debian-Security: https://security-tracker.debian.org/tracker/CVE-2019-10126
A few places in mwifiex_uap_parse_tail_ies() perform memcpy()
unconditionally, which may lead to either a buffer overflow or a read
beyond the buffer boundary.
This patch addresses the issues by checking the read size and the
destination size at each place more properly. Along with the fixes,
the patch cleans up the code slightly by introducing a temporary
variable for the token size, and unifies the error path with the
standard goto statement.
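The bound that every copy now respects, as a user-space sketch; the value
of IEEE_MAX_IE_SIZE here is an assumption for illustration:

#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define IEEE_MAX_IE_SIZE 256           /* assumed cap, for illustration */

static int append_ie(uint8_t *buf, size_t *ie_len,
		     const uint8_t *hdr, size_t token_len)
{
	if (*ie_len + token_len > IEEE_MAX_IE_SIZE)
		return -1;                 /* would overflow ie_buffer */
	memcpy(buf + *ie_len, hdr, token_len);
	*ie_len += token_len;
	return 0;
}

int main(void)
{
	uint8_t buf[IEEE_MAX_IE_SIZE];
	uint8_t token[34] = { 0, 32 };     /* id, len = 32, 32-byte body */
	size_t ie_len = 0;

	while (append_ie(buf, &ie_len, token, sizeof(token)) == 0)
		;                          /* keep appending until the cap */
	printf("stopped at %zu bytes (cap %d)\n", ie_len, IEEE_MAX_IE_SIZE);
	return 0;
}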
Reported-by: huangwen <huangwen@venustech.com.cn>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
[bwh: Backported to 4.19: adjust context]
---
drivers/net/wireless/marvell/mwifiex/ie.c | 47 +++++++++++++++--------
1 file changed, 31 insertions(+), 16 deletions(-)
--- a/drivers/net/wireless/marvell/mwifiex/ie.c
+++ b/drivers/net/wireless/marvell/mwifiex/ie.c
@@ -329,6 +329,8 @@ static int mwifiex_uap_parse_tail_ies(st
struct ieee80211_vendor_ie *vendorhdr;
u16 gen_idx = MWIFIEX_AUTO_IDX_MASK, ie_len = 0;
int left_len, parsed_len = 0;
+ unsigned int token_len;
+ int err = 0;
if (!info->tail || !info->tail_len)
return 0;
@@ -344,6 +346,12 @@ static int mwifiex_uap_parse_tail_ies(st
*/
while (left_len > sizeof(struct ieee_types_header)) {
hdr = (void *)(info->tail + parsed_len);
+ token_len = hdr->len + sizeof(struct ieee_types_header);
+ if (token_len > left_len) {
+ err = -EINVAL;
+ goto out;
+ }
+
switch (hdr->element_id) {
case WLAN_EID_SSID:
case WLAN_EID_SUPP_RATES:
@@ -361,16 +369,19 @@ static int mwifiex_uap_parse_tail_ies(st
if (cfg80211_find_vendor_ie(WLAN_OUI_MICROSOFT,
WLAN_OUI_TYPE_MICROSOFT_WMM,
(const u8 *)hdr,
- hdr->len + sizeof(struct ieee_types_header)))
+ token_len))
break;
default:
- memcpy(gen_ie->ie_buffer + ie_len, hdr,
- hdr->len + sizeof(struct ieee_types_header));
- ie_len += hdr->len + sizeof(struct ieee_types_header);
+ if (ie_len + token_len > IEEE_MAX_IE_SIZE) {
+ err = -EINVAL;
+ goto out;
+ }
+ memcpy(gen_ie->ie_buffer + ie_len, hdr, token_len);
+ ie_len += token_len;
break;
}
- left_len -= hdr->len + sizeof(struct ieee_types_header);
- parsed_len += hdr->len + sizeof(struct ieee_types_header);
+ left_len -= token_len;
+ parsed_len += token_len;
}
/* parse only WPA vendor IE from tail, WMM IE is configured by
@@ -380,15 +391,17 @@ static int mwifiex_uap_parse_tail_ies(st
WLAN_OUI_TYPE_MICROSOFT_WPA,
info->tail, info->tail_len);
if (vendorhdr) {
- memcpy(gen_ie->ie_buffer + ie_len, vendorhdr,
- vendorhdr->len + sizeof(struct ieee_types_header));
- ie_len += vendorhdr->len + sizeof(struct ieee_types_header);
+ token_len = vendorhdr->len + sizeof(struct ieee_types_header);
+ if (ie_len + token_len > IEEE_MAX_IE_SIZE) {
+ err = -EINVAL;
+ goto out;
+ }
+ memcpy(gen_ie->ie_buffer + ie_len, vendorhdr, token_len);
+ ie_len += token_len;
}
- if (!ie_len) {
- kfree(gen_ie);
- return 0;
- }
+ if (!ie_len)
+ goto out;
gen_ie->ie_index = cpu_to_le16(gen_idx);
gen_ie->mgmt_subtype_mask = cpu_to_le16(MGMT_MASK_BEACON |
@@ -398,13 +411,15 @@ static int mwifiex_uap_parse_tail_ies(st
if (mwifiex_update_uap_custom_ie(priv, gen_ie, &gen_idx, NULL, NULL,
NULL, NULL)) {
- kfree(gen_ie);
- return -1;
+ err = -EINVAL;
+ goto out;
}
priv->gen_idx = gen_idx;
+
+ out:
kfree(gen_ie);
- return 0;
+ return err;
}
/* This function parses different IEs-head & tail IEs, beacon IEs,

View File

@ -1,44 +0,0 @@
From: Takashi Iwai <tiwai@suse.de>
Date: Wed, 29 May 2019 14:52:19 +0200
Subject: mwifiex: Fix possible buffer overflows at parsing bss descriptor
Origin: https://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers.git/commit?id=13ec7f10b87f5fc04c4ccbd491c94c7980236a74
Bug-Debian-Security: https://security-tracker.debian.org/tracker/CVE-2019-3846
mwifiex_update_bss_desc_with_ie() calls memcpy() unconditionally in
a couple places without checking the destination size. Since the
source is given from user-space, this may trigger a heap buffer
overflow.
Fix it by putting the length check before performing memcpy().
This fix addresses CVE-2019-3846.
Reported-by: huangwen <huangwen@venustech.com.cn>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
---
drivers/net/wireless/marvell/mwifiex/scan.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/net/wireless/marvell/mwifiex/scan.c b/drivers/net/wireless/marvell/mwifiex/scan.c
index 935778ec9a1b..64ab6fe78c0d 100644
--- a/drivers/net/wireless/marvell/mwifiex/scan.c
+++ b/drivers/net/wireless/marvell/mwifiex/scan.c
@@ -1247,6 +1247,8 @@ int mwifiex_update_bss_desc_with_ie(struct mwifiex_adapter *adapter,
}
switch (element_id) {
case WLAN_EID_SSID:
+ if (element_len > IEEE80211_MAX_SSID_LEN)
+ return -EINVAL;
bss_entry->ssid.ssid_len = element_len;
memcpy(bss_entry->ssid.ssid, (current_ptr + 2),
element_len);
@@ -1256,6 +1258,8 @@ int mwifiex_update_bss_desc_with_ie(struct mwifiex_adapter *adapter,
break;
case WLAN_EID_SUPP_RATES:
+ if (element_len > MWIFIEX_SUPPORTED_RATES)
+ return -EINVAL;
memcpy(bss_entry->data_rates, current_ptr + 2,
element_len);
memcpy(bss_entry->supported_rates, current_ptr + 2,

View File

@ -1,162 +0,0 @@
From: Eric Dumazet <edumazet@google.com>
Date: Wed, 27 Mar 2019 12:40:33 -0700
Subject: inet: switch IP ID generator to siphash
Origin: https://git.kernel.org/linus/df453700e8d81b1bdafdf684365ee2b9431fb702
Bug-Debian-Security: https://security-tracker.debian.org/tracker/CVE-2019-10638
[ Upstream commit df453700e8d81b1bdafdf684365ee2b9431fb702 ]
According to Amit Klein and Benny Pinkas, IP ID generation is too weak
and might be used by attackers.
Even with the recent net_hash_mix() fix ("netns: provide pure entropy for
net_hash_mix"), having a 64-bit key and a Jenkins hash is risky.
It is time to switch to siphash and its 128bit keys.
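The structure of the change, sketched in user space: a 128-bit key
initialised lazily, then a keyed hash over the flow tuple. The mixer below
is a toy stand-in, not siphash, and rand() stands in for
get_random_bytes(); only the keying pattern is the point:

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

typedef struct { uint64_t key[2]; } siphash_key_t;

static int siphash_key_is_zero(const siphash_key_t *key)
{
	return !(key->key[0] | key->key[1]);
}

static uint32_t keyed_hash3(uint32_t a, uint32_t b, uint32_t c,
			    const siphash_key_t *key)
{
	uint64_t h = key->key[0] ^ a;      /* toy mixer, NOT siphash */
	h = h * 0x100000001b3ULL ^ b;
	h = h * 0x100000001b3ULL ^ c;
	return (uint32_t)(h ^ key->key[1]);
}

int main(void)
{
	static siphash_key_t ip_id_key;    /* all-zero until first use */

	srand((unsigned)time(NULL));
	if (siphash_key_is_zero(&ip_id_key)) {  /* lazy init, as in the patch */
		ip_id_key.key[0] = ((uint64_t)rand() << 32) | (unsigned)rand();
		ip_id_key.key[1] = ((uint64_t)rand() << 32) | (unsigned)rand();
	}
	printf("id hash = %u\n",
	       keyed_hash3(0x0a000001, 0x0a000002, 6, &ip_id_key));
	return 0;
}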
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Amit Klein <aksecurity@gmail.com>
Reported-by: Benny Pinkas <benny@pinkas.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
include/linux/siphash.h | 5 +++++
include/net/netns/ipv4.h | 2 ++
net/ipv4/route.c | 12 +++++++-----
net/ipv6/output_core.c | 30 ++++++++++++++++--------------
4 files changed, 30 insertions(+), 19 deletions(-)
diff --git a/include/linux/siphash.h b/include/linux/siphash.h
index fa7a6b9cedbf..bf21591a9e5e 100644
--- a/include/linux/siphash.h
+++ b/include/linux/siphash.h
@@ -21,6 +21,11 @@ typedef struct {
u64 key[2];
} siphash_key_t;
+static inline bool siphash_key_is_zero(const siphash_key_t *key)
+{
+ return !(key->key[0] | key->key[1]);
+}
+
u64 __siphash_aligned(const void *data, size_t len, const siphash_key_t *key);
#ifndef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
u64 __siphash_unaligned(const void *data, size_t len, const siphash_key_t *key);
diff --git a/include/net/netns/ipv4.h b/include/net/netns/ipv4.h
index e47503b4e4d1..622db6bc2f02 100644
--- a/include/net/netns/ipv4.h
+++ b/include/net/netns/ipv4.h
@@ -9,6 +9,7 @@
#include <linux/uidgid.h>
#include <net/inet_frag.h>
#include <linux/rcupdate.h>
+#include <linux/siphash.h>
struct tcpm_hash_bucket;
struct ctl_table_header;
@@ -214,5 +215,6 @@ struct netns_ipv4 {
unsigned int ipmr_seq; /* protected by rtnl_mutex */
atomic_t rt_genid;
+ siphash_key_t ip_id_key;
};
#endif
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 8bacbcd2db90..40bf19f7ae1a 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -500,15 +500,17 @@ EXPORT_SYMBOL(ip_idents_reserve);
void __ip_select_ident(struct net *net, struct iphdr *iph, int segs)
{
- static u32 ip_idents_hashrnd __read_mostly;
u32 hash, id;
- net_get_random_once(&ip_idents_hashrnd, sizeof(ip_idents_hashrnd));
+ /* Note the following code is not safe, but this is okay. */
+ if (unlikely(siphash_key_is_zero(&net->ipv4.ip_id_key)))
+ get_random_bytes(&net->ipv4.ip_id_key,
+ sizeof(net->ipv4.ip_id_key));
- hash = jhash_3words((__force u32)iph->daddr,
+ hash = siphash_3u32((__force u32)iph->daddr,
(__force u32)iph->saddr,
- iph->protocol ^ net_hash_mix(net),
- ip_idents_hashrnd);
+ iph->protocol,
+ &net->ipv4.ip_id_key);
id = ip_idents_reserve(hash, segs);
iph->id = htons(id);
}
diff --git a/net/ipv6/output_core.c b/net/ipv6/output_core.c
index 4fe7c90962dd..868ae23dbae1 100644
--- a/net/ipv6/output_core.c
+++ b/net/ipv6/output_core.c
@@ -10,15 +10,25 @@
#include <net/secure_seq.h>
#include <linux/netfilter.h>
-static u32 __ipv6_select_ident(struct net *net, u32 hashrnd,
+static u32 __ipv6_select_ident(struct net *net,
const struct in6_addr *dst,
const struct in6_addr *src)
{
+ const struct {
+ struct in6_addr dst;
+ struct in6_addr src;
+ } __aligned(SIPHASH_ALIGNMENT) combined = {
+ .dst = *dst,
+ .src = *src,
+ };
u32 hash, id;
- hash = __ipv6_addr_jhash(dst, hashrnd);
- hash = __ipv6_addr_jhash(src, hash);
- hash ^= net_hash_mix(net);
+ /* Note the following code is not safe, but this is okay. */
+ if (unlikely(siphash_key_is_zero(&net->ipv4.ip_id_key)))
+ get_random_bytes(&net->ipv4.ip_id_key,
+ sizeof(net->ipv4.ip_id_key));
+
+ hash = siphash(&combined, sizeof(combined), &net->ipv4.ip_id_key);
/* Treat id of 0 as unset and if we get 0 back from ip_idents_reserve,
* set the hight order instead thus minimizing possible future
@@ -41,7 +51,6 @@ static u32 __ipv6_select_ident(struct net *net, u32 hashrnd,
*/
__be32 ipv6_proxy_select_ident(struct net *net, struct sk_buff *skb)
{
- static u32 ip6_proxy_idents_hashrnd __read_mostly;
struct in6_addr buf[2];
struct in6_addr *addrs;
u32 id;
@@ -53,11 +62,7 @@ __be32 ipv6_proxy_select_ident(struct net *net, struct sk_buff *skb)
if (!addrs)
return 0;
- net_get_random_once(&ip6_proxy_idents_hashrnd,
- sizeof(ip6_proxy_idents_hashrnd));
-
- id = __ipv6_select_ident(net, ip6_proxy_idents_hashrnd,
- &addrs[1], &addrs[0]);
+ id = __ipv6_select_ident(net, &addrs[1], &addrs[0]);
return htonl(id);
}
EXPORT_SYMBOL_GPL(ipv6_proxy_select_ident);
@@ -66,12 +71,9 @@ __be32 ipv6_select_ident(struct net *net,
const struct in6_addr *daddr,
const struct in6_addr *saddr)
{
- static u32 ip6_idents_hashrnd __read_mostly;
u32 id;
- net_get_random_once(&ip6_idents_hashrnd, sizeof(ip6_idents_hashrnd));
-
- id = __ipv6_select_ident(net, ip6_idents_hashrnd, daddr, saddr);
+ id = __ipv6_select_ident(net, daddr, saddr);
return htonl(id);
}
EXPORT_SYMBOL(ipv6_select_ident);
--
cgit 1.2-0.3.lf.el7

View File

@ -1,36 +0,0 @@
From: Young Xiao <92siuyang@gmail.com>
Date: Fri, 14 Jun 2019 15:13:02 +0800
Subject: nfc: Ensure presence of required attributes in the deactivate_target
handler
Origin: https://git.kernel.org/linus/385097a3675749cbc9e97c085c0e5dfe4269ca51
Bug-Debian-Security: https://security-tracker.debian.org/tracker/CVE-2019-12984
Check that the NFC_ATTR_TARGET_INDEX attribute (in addition to
NFC_ATTR_DEVICE_INDEX) is provided by the netlink client prior to
accessing them. This prevents potential unhandled NULL pointer dereference
exceptions which can be triggered by malicious user-mode programs,
if they omit one or both of these attributes.
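The general shape of the fix, modelled with a trivial stand-in for netlink
attributes; the real struct nlattr and nla_get_u32() differ, this only
shows the presence check:

#include <stdint.h>
#include <stdio.h>

#define NFC_ATTR_DEVICE_INDEX 1
#define NFC_ATTR_TARGET_INDEX 2
#define NFC_ATTR_MAX          8

struct fake_nlattr { uint32_t value; };    /* stand-in, not struct nlattr */

static int deactivate_target(struct fake_nlattr *attrs[NFC_ATTR_MAX + 1])
{
	if (!attrs[NFC_ATTR_DEVICE_INDEX] ||
	    !attrs[NFC_ATTR_TARGET_INDEX])     /* the added second check */
		return -22;                    /* -EINVAL */
	printf("deactivate dev=%u target=%u\n",
	       attrs[NFC_ATTR_DEVICE_INDEX]->value,
	       attrs[NFC_ATTR_TARGET_INDEX]->value);
	return 0;
}

int main(void)
{
	struct fake_nlattr dev = { 0 };
	struct fake_nlattr *attrs[NFC_ATTR_MAX + 1] = { 0 };

	attrs[NFC_ATTR_DEVICE_INDEX] = &dev;   /* target index omitted */
	printf("rc = %d\n", deactivate_target(attrs));  /* rc = -22 */
	return 0;
}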
Signed-off-by: Young Xiao <92siuyang@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
net/nfc/netlink.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/net/nfc/netlink.c b/net/nfc/netlink.c
index 1180b3e58a0a..ea64c90b14e8 100644
--- a/net/nfc/netlink.c
+++ b/net/nfc/netlink.c
@@ -911,7 +911,8 @@ static int nfc_genl_deactivate_target(struct sk_buff *skb,
u32 device_idx, target_idx;
int rc;
- if (!info->attrs[NFC_ATTR_DEVICE_INDEX])
+ if (!info->attrs[NFC_ATTR_DEVICE_INDEX] ||
+ !info->attrs[NFC_ATTR_TARGET_INDEX])
return -EINVAL;
device_idx = nla_get_u32(info->attrs[NFC_ATTR_DEVICE_INDEX]);
--
cgit 1.2-0.3.lf.el7

View File

@ -1,57 +0,0 @@
From: Jann Horn <jannh@google.com>
Date: Thu, 4 Jul 2019 17:32:23 +0200
Subject: ptrace: Fix ->ptracer_cred handling for PTRACE_TRACEME
Origin: https://git.kernel.org/linus/6994eefb0053799d2e07cd140df6c2ea106c41ee
Bug-Debian-Security: https://security-tracker.debian.org/tracker/CVE-2019-13272
Fix two issues:
When called for PTRACE_TRACEME, ptrace_link() would obtain an RCU
reference to the parent's objective credentials, then give that pointer
to get_cred(). However, the object lifetime rules for things like
struct cred do not permit unconditionally turning an RCU reference into
a stable reference.
PTRACE_TRACEME records the parent's credentials as if the parent was
acting as the subject, but that's not the case. If a malicious
unprivileged child uses PTRACE_TRACEME and the parent is privileged, and
at a later point, the parent process becomes attacker-controlled
(because it drops privileges and calls execve()), the attacker ends up
with control over two processes with a privileged ptrace relationship,
which can be abused to ptrace a suid binary and obtain root privileges.
Fix both of these by always recording the credentials of the process
that is requesting the creation of the ptrace relationship:
current_cred() can't change under us, and current is the proper subject
for access control.
This change is theoretically userspace-visible, but I am not aware of
any code that it will actually break.
Fixes: 64b875f7ac8a ("ptrace: Capture the ptracer's creds not PT_PTRACE_CAP")
Signed-off-by: Jann Horn <jannh@google.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---
kernel/ptrace.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index 8456b6e2205f..705887f63288 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -79,9 +79,7 @@ void __ptrace_link(struct task_struct *child, struct task_struct *new_parent,
*/
static void ptrace_link(struct task_struct *child, struct task_struct *new_parent)
{
- rcu_read_lock();
- __ptrace_link(child, new_parent, __task_cred(new_parent));
- rcu_read_unlock();
+ __ptrace_link(child, new_parent, current_cred());
}
/**
--
2.20.1

View File

@ -1,35 +0,0 @@
From: Ben Hutchings <ben@decadent.org.uk>
Date: Tue, 9 Apr 2019 01:01:56 +0100
Subject: Revert "net: stmmac: Send TSO packets always from Queue 0"
Forwarded: https://lore.kernel.org/lkml/a5f9b02fbb5ca830e598f1c601cdbecc6c86b789.camel@decadent.org.uk/T/#u
This reverts commit 496eaed7fe94df7202d7cbe37873f96bcdda375e, which
was commit c5acdbee22a1b200dde07effd26fd1f649e9ab8a upstream. The
reverted change introduces data races.
---
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 11 +----------
1 file changed, 1 insertion(+), 10 deletions(-)
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 886176be818e..8c3e228b1da6 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -3033,17 +3033,8 @@ static netdev_tx_t stmmac_xmit(struct sk_buff *skb, struct net_device *dev)
/* Manage oversized TCP frames for GMAC4 device */
if (skb_is_gso(skb) && priv->tso) {
- if (skb_shinfo(skb)->gso_type & (SKB_GSO_TCPV4 | SKB_GSO_TCPV6)) {
- /*
- * There is no way to determine the number of TSO
- * capable Queues. Let's use always the Queue 0
- * because if TSO is supported then at least this
- * one will be capable.
- */
- skb_set_queue_mapping(skb, 0);
-
+ if (skb_shinfo(skb)->gso_type & (SKB_GSO_TCPV4 | SKB_GSO_TCPV6))
return stmmac_tso_xmit(skb, dev);
- }
}
if (unlikely(stmmac_tx_avail(priv, queue) < nfrags + 1)) {

View File

@ -1,65 +0,0 @@
From b90cd6f2b905905fb42671009dc0e27c310a16ae Mon Sep 17 00:00:00 2001
From: Jason Yan <yanaijie@huawei.com>
Date: Tue, 25 Sep 2018 10:56:54 +0800
Subject: scsi: libsas: fix a race condition when smp task timeout
Origin: https://git.kernel.org/linus/b90cd6f2b905905fb42671009dc0e27c310a16ae
Bug-Debian-Security: https://security-tracker.debian.org/tracker/CVE-2018-20836
When the LLDD is completing the sas task in interrupt context and sets the
task state to SAS_TASK_STATE_DONE, the smp timeout timer can fire at the
same time, and smp_task_timedout() will complete the task whether
SAS_TASK_STATE_DONE is set or not. The sas task may then be freed before
the LLDD finishes its interrupt processing, so a use-after-free can occur.
Fix this by calling complete() only when SAS_TASK_STATE_DONE is not set,
and remove the check of del_timer()'s return value. Once the LLDD sets
DONE, it must call task->done(), which calls smp_task_done()->complete(),
so the task is completed and freed correctly.
Reported-by: chenxiang <chenxiang66@hisilicon.com>
Signed-off-by: Jason Yan <yanaijie@huawei.com>
CC: John Garry <john.garry@huawei.com>
CC: Johannes Thumshirn <jthumshirn@suse.de>
CC: Ewan Milne <emilne@redhat.com>
CC: Christoph Hellwig <hch@lst.de>
CC: Tomas Henzl <thenzl@redhat.com>
CC: Dan Williams <dan.j.williams@intel.com>
CC: Hannes Reinecke <hare@suse.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
---
drivers/scsi/libsas/sas_expander.c | 9 ++++-----
1 file changed, 4 insertions(+), 5 deletions(-)
diff --git a/drivers/scsi/libsas/sas_expander.c b/drivers/scsi/libsas/sas_expander.c
index 52222940d398..0d1f72752ca2 100644
--- a/drivers/scsi/libsas/sas_expander.c
+++ b/drivers/scsi/libsas/sas_expander.c
@@ -48,17 +48,16 @@ static void smp_task_timedout(struct timer_list *t)
unsigned long flags;
spin_lock_irqsave(&task->task_state_lock, flags);
- if (!(task->task_state_flags & SAS_TASK_STATE_DONE))
+ if (!(task->task_state_flags & SAS_TASK_STATE_DONE)) {
task->task_state_flags |= SAS_TASK_STATE_ABORTED;
+ complete(&task->slow_task->completion);
+ }
spin_unlock_irqrestore(&task->task_state_lock, flags);
-
- complete(&task->slow_task->completion);
}
static void smp_task_done(struct sas_task *task)
{
- if (!del_timer(&task->slow_task->timer))
- return;
+ del_timer(&task->slow_task->timer);
complete(&task->slow_task->completion);
}
--
cgit 1.2-0.3.lf.el7

View File

@ -1,40 +0,0 @@
From 5a3e9a68c76f16a4d3d95b5703f8c0cbf1ce526c Mon Sep 17 00:00:00 2001
From: Salvatore Bonaccorso <carnil@debian.org>
Date: Wed, 15 Aug 2018 07:46:04 +0200
Subject: [PATCH 01/30] Documentation/l1tf: Fix small spelling typo
commit 60ca05c3b44566b70d64fbb8e87a6e0c67725468 upstream
Fix small typo (wiil -> will) in the "3.4. Nested virtual machines"
section.
Fixes: 5b76a3cff011 ("KVM: VMX: Tell the nested hypervisor to skip L1D flush on vmentry")
Cc: linux-kernel@vger.kernel.org
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-doc@vger.kernel.org
Cc: trivial@kernel.org
Signed-off-by: Salvatore Bonaccorso <carnil@debian.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
Documentation/admin-guide/l1tf.rst | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/Documentation/admin-guide/l1tf.rst b/Documentation/admin-guide/l1tf.rst
index 9f5924f81f89..9af977384168 100644
--- a/Documentation/admin-guide/l1tf.rst
+++ b/Documentation/admin-guide/l1tf.rst
@@ -556,7 +556,7 @@ carefully analyzed. For full protection the following methods are
the bare metal hypervisor, the nested hypervisor and the nested virtual
machine. VMENTER operations from the nested hypervisor into the nested
guest will always be processed by the bare metal hypervisor. If KVM is the
-bare metal hypervisor it wiil:
+bare metal hypervisor it will:
- Flush the L1D cache on every switch from the nested hypervisor to the
nested virtual machine, so that the nested hypervisor's secrets are not

View File

@ -1,782 +0,0 @@
From fae627376bb7cb519a54a1ac372623653a170317 Mon Sep 17 00:00:00 2001
From: Peter Zijlstra <peterz@infradead.org>
Date: Tue, 7 Aug 2018 10:17:27 -0700
Subject: [PATCH 02/30] x86/cpu: Sanitize FAM6_ATOM naming
commit f2c4db1bd80720cd8cb2a5aa220d9bc9f374f04e upstream
Going primarily by:
https://en.wikipedia.org/wiki/List_of_Intel_Atom_microprocessors
with additional information gleaned from other related pages; notably:
- Bonnell shrink was called Saltwell
- Moorefield is the Merriefield refresh which makes it Airmont
The general naming scheme is: FAM6_ATOM_UARCH_SOCTYPE
for i in `git grep -l FAM6_ATOM` ; do
sed -i -e 's/ATOM_PINEVIEW/ATOM_BONNELL/g' \
-e 's/ATOM_LINCROFT/ATOM_BONNELL_MID/' \
-e 's/ATOM_PENWELL/ATOM_SALTWELL_MID/g' \
-e 's/ATOM_CLOVERVIEW/ATOM_SALTWELL_TABLET/g' \
-e 's/ATOM_CEDARVIEW/ATOM_SALTWELL/g' \
-e 's/ATOM_SILVERMONT1/ATOM_SILVERMONT/g' \
-e 's/ATOM_SILVERMONT2/ATOM_SILVERMONT_X/g' \
-e 's/ATOM_MERRIFIELD/ATOM_SILVERMONT_MID/g' \
-e 's/ATOM_MOOREFIELD/ATOM_AIRMONT_MID/g' \
-e 's/ATOM_DENVERTON/ATOM_GOLDMONT_X/g' \
-e 's/ATOM_GEMINI_LAKE/ATOM_GOLDMONT_PLUS/g' ${i}
done
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: dave.hansen@linux.intel.com
Cc: len.brown@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
arch/x86/events/intel/core.c | 20 ++++----
arch/x86/events/intel/cstate.c | 8 ++--
arch/x86/events/intel/rapl.c | 4 +-
arch/x86/events/msr.c | 8 ++--
arch/x86/include/asm/intel-family.h | 33 ++++++-------
arch/x86/kernel/cpu/common.c | 28 +++++------
arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c | 4 +-
arch/x86/kernel/tsc.c | 2 +-
arch/x86/kernel/tsc_msr.c | 10 ++--
arch/x86/platform/atom/punit_atom_debug.c | 4 +-
.../intel-mid/device_libs/platform_bt.c | 2 +-
drivers/acpi/acpi_lpss.c | 2 +-
drivers/acpi/x86/utils.c | 2 +-
drivers/cpufreq/intel_pstate.c | 4 +-
drivers/edac/pnd2_edac.c | 2 +-
drivers/idle/intel_idle.c | 18 ++++----
drivers/mmc/host/sdhci-acpi.c | 2 +-
drivers/pci/pci-mid.c | 4 +-
drivers/platform/x86/intel_int0002_vgpio.c | 2 +-
drivers/platform/x86/intel_mid_powerbtn.c | 4 +-
.../platform/x86/intel_telemetry_debugfs.c | 2 +-
drivers/platform/x86/intel_telemetry_pltdrv.c | 2 +-
drivers/powercap/intel_rapl.c | 10 ++--
drivers/thermal/intel_soc_dts_thermal.c | 2 +-
sound/soc/intel/boards/bytcr_rt5651.c | 2 +-
tools/power/x86/turbostat/turbostat.c | 46 +++++++++----------
26 files changed, 115 insertions(+), 112 deletions(-)
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 3dd204d1dd19..a82c7b655d77 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -4126,11 +4126,11 @@ __init int intel_pmu_init(void)
name = "nehalem";
break;
- case INTEL_FAM6_ATOM_PINEVIEW:
- case INTEL_FAM6_ATOM_LINCROFT:
- case INTEL_FAM6_ATOM_PENWELL:
- case INTEL_FAM6_ATOM_CLOVERVIEW:
- case INTEL_FAM6_ATOM_CEDARVIEW:
+ case INTEL_FAM6_ATOM_BONNELL:
+ case INTEL_FAM6_ATOM_BONNELL_MID:
+ case INTEL_FAM6_ATOM_SALTWELL:
+ case INTEL_FAM6_ATOM_SALTWELL_MID:
+ case INTEL_FAM6_ATOM_SALTWELL_TABLET:
memcpy(hw_cache_event_ids, atom_hw_cache_event_ids,
sizeof(hw_cache_event_ids));
@@ -4143,9 +4143,11 @@ __init int intel_pmu_init(void)
name = "bonnell";
break;
- case INTEL_FAM6_ATOM_SILVERMONT1:
- case INTEL_FAM6_ATOM_SILVERMONT2:
+ case INTEL_FAM6_ATOM_SILVERMONT:
+ case INTEL_FAM6_ATOM_SILVERMONT_X:
+ case INTEL_FAM6_ATOM_SILVERMONT_MID:
case INTEL_FAM6_ATOM_AIRMONT:
+ case INTEL_FAM6_ATOM_AIRMONT_MID:
memcpy(hw_cache_event_ids, slm_hw_cache_event_ids,
sizeof(hw_cache_event_ids));
memcpy(hw_cache_extra_regs, slm_hw_cache_extra_regs,
@@ -4164,7 +4166,7 @@ __init int intel_pmu_init(void)
break;
case INTEL_FAM6_ATOM_GOLDMONT:
- case INTEL_FAM6_ATOM_DENVERTON:
+ case INTEL_FAM6_ATOM_GOLDMONT_X:
memcpy(hw_cache_event_ids, glm_hw_cache_event_ids,
sizeof(hw_cache_event_ids));
memcpy(hw_cache_extra_regs, glm_hw_cache_extra_regs,
@@ -4190,7 +4192,7 @@ __init int intel_pmu_init(void)
name = "goldmont";
break;
- case INTEL_FAM6_ATOM_GEMINI_LAKE:
+ case INTEL_FAM6_ATOM_GOLDMONT_PLUS:
memcpy(hw_cache_event_ids, glp_hw_cache_event_ids,
sizeof(hw_cache_event_ids));
memcpy(hw_cache_extra_regs, glp_hw_cache_extra_regs,
diff --git a/arch/x86/events/intel/cstate.c b/arch/x86/events/intel/cstate.c
index 6eb76106c469..56194c571299 100644
--- a/arch/x86/events/intel/cstate.c
+++ b/arch/x86/events/intel/cstate.c
@@ -559,8 +559,8 @@ static const struct x86_cpu_id intel_cstates_match[] __initconst = {
X86_CSTATES_MODEL(INTEL_FAM6_HASWELL_ULT, hswult_cstates),
- X86_CSTATES_MODEL(INTEL_FAM6_ATOM_SILVERMONT1, slm_cstates),
- X86_CSTATES_MODEL(INTEL_FAM6_ATOM_SILVERMONT2, slm_cstates),
+ X86_CSTATES_MODEL(INTEL_FAM6_ATOM_SILVERMONT, slm_cstates),
+ X86_CSTATES_MODEL(INTEL_FAM6_ATOM_SILVERMONT_X, slm_cstates),
X86_CSTATES_MODEL(INTEL_FAM6_ATOM_AIRMONT, slm_cstates),
X86_CSTATES_MODEL(INTEL_FAM6_BROADWELL_CORE, snb_cstates),
@@ -581,9 +581,9 @@ static const struct x86_cpu_id intel_cstates_match[] __initconst = {
X86_CSTATES_MODEL(INTEL_FAM6_XEON_PHI_KNM, knl_cstates),
X86_CSTATES_MODEL(INTEL_FAM6_ATOM_GOLDMONT, glm_cstates),
- X86_CSTATES_MODEL(INTEL_FAM6_ATOM_DENVERTON, glm_cstates),
+ X86_CSTATES_MODEL(INTEL_FAM6_ATOM_GOLDMONT_X, glm_cstates),
- X86_CSTATES_MODEL(INTEL_FAM6_ATOM_GEMINI_LAKE, glm_cstates),
+ X86_CSTATES_MODEL(INTEL_FAM6_ATOM_GOLDMONT_PLUS, glm_cstates),
{ },
};
MODULE_DEVICE_TABLE(x86cpu, intel_cstates_match);
diff --git a/arch/x86/events/intel/rapl.c b/arch/x86/events/intel/rapl.c
index 32f3e9423e99..91039ffed633 100644
--- a/arch/x86/events/intel/rapl.c
+++ b/arch/x86/events/intel/rapl.c
@@ -777,9 +777,9 @@ static const struct x86_cpu_id rapl_cpu_match[] __initconst = {
X86_RAPL_MODEL_MATCH(INTEL_FAM6_CANNONLAKE_MOBILE, skl_rapl_init),
X86_RAPL_MODEL_MATCH(INTEL_FAM6_ATOM_GOLDMONT, hsw_rapl_init),
- X86_RAPL_MODEL_MATCH(INTEL_FAM6_ATOM_DENVERTON, hsw_rapl_init),
+ X86_RAPL_MODEL_MATCH(INTEL_FAM6_ATOM_GOLDMONT_X, hsw_rapl_init),
- X86_RAPL_MODEL_MATCH(INTEL_FAM6_ATOM_GEMINI_LAKE, hsw_rapl_init),
+ X86_RAPL_MODEL_MATCH(INTEL_FAM6_ATOM_GOLDMONT_PLUS, hsw_rapl_init),
{},
};
diff --git a/arch/x86/events/msr.c b/arch/x86/events/msr.c
index b4771a6ddbc1..1b9f85abf9bc 100644
--- a/arch/x86/events/msr.c
+++ b/arch/x86/events/msr.c
@@ -69,14 +69,14 @@ static bool test_intel(int idx)
case INTEL_FAM6_BROADWELL_GT3E:
case INTEL_FAM6_BROADWELL_X:
- case INTEL_FAM6_ATOM_SILVERMONT1:
- case INTEL_FAM6_ATOM_SILVERMONT2:
+ case INTEL_FAM6_ATOM_SILVERMONT:
+ case INTEL_FAM6_ATOM_SILVERMONT_X:
case INTEL_FAM6_ATOM_AIRMONT:
case INTEL_FAM6_ATOM_GOLDMONT:
- case INTEL_FAM6_ATOM_DENVERTON:
+ case INTEL_FAM6_ATOM_GOLDMONT_X:
- case INTEL_FAM6_ATOM_GEMINI_LAKE:
+ case INTEL_FAM6_ATOM_GOLDMONT_PLUS:
case INTEL_FAM6_XEON_PHI_KNL:
case INTEL_FAM6_XEON_PHI_KNM:
diff --git a/arch/x86/include/asm/intel-family.h b/arch/x86/include/asm/intel-family.h
index 0ad25cc895ae..058b1a1994c4 100644
--- a/arch/x86/include/asm/intel-family.h
+++ b/arch/x86/include/asm/intel-family.h
@@ -8,9 +8,6 @@
* The "_X" parts are generally the EP and EX Xeons, or the
* "Extreme" ones, like Broadwell-E.
*
- * Things ending in "2" are usually because we have no better
- * name for them. There's no processor called "SILVERMONT2".
- *
* While adding a new CPUID for a new microarchitecture, add a new
* group to keep logically sorted out in chronological order. Within
* that group keep the CPUID for the variants sorted by model number.
@@ -59,19 +56,23 @@
/* "Small Core" Processors (Atom) */
-#define INTEL_FAM6_ATOM_PINEVIEW 0x1C
-#define INTEL_FAM6_ATOM_LINCROFT 0x26
-#define INTEL_FAM6_ATOM_PENWELL 0x27
-#define INTEL_FAM6_ATOM_CLOVERVIEW 0x35
-#define INTEL_FAM6_ATOM_CEDARVIEW 0x36
-#define INTEL_FAM6_ATOM_SILVERMONT1 0x37 /* BayTrail/BYT / Valleyview */
-#define INTEL_FAM6_ATOM_SILVERMONT2 0x4D /* Avaton/Rangely */
-#define INTEL_FAM6_ATOM_AIRMONT 0x4C /* CherryTrail / Braswell */
-#define INTEL_FAM6_ATOM_MERRIFIELD 0x4A /* Tangier */
-#define INTEL_FAM6_ATOM_MOOREFIELD 0x5A /* Anniedale */
-#define INTEL_FAM6_ATOM_GOLDMONT 0x5C
-#define INTEL_FAM6_ATOM_DENVERTON 0x5F /* Goldmont Microserver */
-#define INTEL_FAM6_ATOM_GEMINI_LAKE 0x7A
+#define INTEL_FAM6_ATOM_BONNELL 0x1C /* Diamondville, Pineview */
+#define INTEL_FAM6_ATOM_BONNELL_MID 0x26 /* Silverthorne, Lincroft */
+
+#define INTEL_FAM6_ATOM_SALTWELL 0x36 /* Cedarview */
+#define INTEL_FAM6_ATOM_SALTWELL_MID 0x27 /* Penwell */
+#define INTEL_FAM6_ATOM_SALTWELL_TABLET 0x35 /* Cloverview */
+
+#define INTEL_FAM6_ATOM_SILVERMONT 0x37 /* Bay Trail, Valleyview */
+#define INTEL_FAM6_ATOM_SILVERMONT_X 0x4D /* Avaton, Rangely */
+#define INTEL_FAM6_ATOM_SILVERMONT_MID 0x4A /* Merriefield */
+
+#define INTEL_FAM6_ATOM_AIRMONT 0x4C /* Cherry Trail, Braswell */
+#define INTEL_FAM6_ATOM_AIRMONT_MID 0x5A /* Moorefield */
+
+#define INTEL_FAM6_ATOM_GOLDMONT 0x5C /* Apollo Lake */
+#define INTEL_FAM6_ATOM_GOLDMONT_X 0x5F /* Denverton */
+#define INTEL_FAM6_ATOM_GOLDMONT_PLUS 0x7A /* Gemini Lake */
/* Xeon Phi */
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 44c4ef3d989b..10e5ccfa9278 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -949,11 +949,11 @@ static void identify_cpu_without_cpuid(struct cpuinfo_x86 *c)
}
static const __initconst struct x86_cpu_id cpu_no_speculation[] = {
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_CEDARVIEW, X86_FEATURE_ANY },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_CLOVERVIEW, X86_FEATURE_ANY },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_LINCROFT, X86_FEATURE_ANY },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_PENWELL, X86_FEATURE_ANY },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_PINEVIEW, X86_FEATURE_ANY },
+ { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_SALTWELL, X86_FEATURE_ANY },
+ { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_SALTWELL_TABLET, X86_FEATURE_ANY },
+ { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_BONNELL_MID, X86_FEATURE_ANY },
+ { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_SALTWELL_MID, X86_FEATURE_ANY },
+ { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_BONNELL, X86_FEATURE_ANY },
{ X86_VENDOR_CENTAUR, 5 },
{ X86_VENDOR_INTEL, 5 },
{ X86_VENDOR_NSC, 5 },
@@ -968,10 +968,10 @@ static const __initconst struct x86_cpu_id cpu_no_meltdown[] = {
/* Only list CPUs which speculate but are non susceptible to SSB */
static const __initconst struct x86_cpu_id cpu_no_spec_store_bypass[] = {
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_SILVERMONT1 },
+ { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_SILVERMONT },
{ X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_AIRMONT },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_SILVERMONT2 },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_MERRIFIELD },
+ { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_SILVERMONT_X },
+ { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_SILVERMONT_MID },
{ X86_VENDOR_INTEL, 6, INTEL_FAM6_CORE_YONAH },
{ X86_VENDOR_INTEL, 6, INTEL_FAM6_XEON_PHI_KNL },
{ X86_VENDOR_INTEL, 6, INTEL_FAM6_XEON_PHI_KNM },
@@ -984,14 +984,14 @@ static const __initconst struct x86_cpu_id cpu_no_spec_store_bypass[] = {
static const __initconst struct x86_cpu_id cpu_no_l1tf[] = {
/* in addition to cpu_no_speculation */
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_SILVERMONT1 },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_SILVERMONT2 },
+ { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_SILVERMONT },
+ { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_SILVERMONT_X },
{ X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_AIRMONT },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_MERRIFIELD },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_MOOREFIELD },
+ { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_SILVERMONT_MID },
+ { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_AIRMONT_MID },
{ X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_GOLDMONT },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_DENVERTON },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_GEMINI_LAKE },
+ { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_GOLDMONT_X },
+ { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_GOLDMONT_PLUS },
{ X86_VENDOR_INTEL, 6, INTEL_FAM6_XEON_PHI_KNL },
{ X86_VENDOR_INTEL, 6, INTEL_FAM6_XEON_PHI_KNM },
{}
diff --git a/arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c b/arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c
index f8c260d522ca..912d53939f4f 100644
--- a/arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c
+++ b/arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c
@@ -91,7 +91,7 @@ static u64 get_prefetch_disable_bits(void)
*/
return 0xF;
case INTEL_FAM6_ATOM_GOLDMONT:
- case INTEL_FAM6_ATOM_GEMINI_LAKE:
+ case INTEL_FAM6_ATOM_GOLDMONT_PLUS:
/*
* SDM defines bits of MSR_MISC_FEATURE_CONTROL register
* as:
@@ -995,7 +995,7 @@ static int measure_cycles_perf_fn(void *_plr)
switch (boot_cpu_data.x86_model) {
case INTEL_FAM6_ATOM_GOLDMONT:
- case INTEL_FAM6_ATOM_GEMINI_LAKE:
+ case INTEL_FAM6_ATOM_GOLDMONT_PLUS:
l2_hit_bits = (0x52ULL << 16) | (0x2 << 8) | 0xd1;
l2_miss_bits = (0x52ULL << 16) | (0x10 << 8) | 0xd1;
break;
diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index 6d5dc5dabfd7..03b7529333a6 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -636,7 +636,7 @@ unsigned long native_calibrate_tsc(void)
case INTEL_FAM6_KABYLAKE_DESKTOP:
crystal_khz = 24000; /* 24.0 MHz */
break;
- case INTEL_FAM6_ATOM_DENVERTON:
+ case INTEL_FAM6_ATOM_GOLDMONT_X:
crystal_khz = 25000; /* 25.0 MHz */
break;
case INTEL_FAM6_ATOM_GOLDMONT:
diff --git a/arch/x86/kernel/tsc_msr.c b/arch/x86/kernel/tsc_msr.c
index 27ef714d886c..3d0e9aeea7c8 100644
--- a/arch/x86/kernel/tsc_msr.c
+++ b/arch/x86/kernel/tsc_msr.c
@@ -59,12 +59,12 @@ static const struct freq_desc freq_desc_ann = {
};
static const struct x86_cpu_id tsc_msr_cpu_ids[] = {
- INTEL_CPU_FAM6(ATOM_PENWELL, freq_desc_pnw),
- INTEL_CPU_FAM6(ATOM_CLOVERVIEW, freq_desc_clv),
- INTEL_CPU_FAM6(ATOM_SILVERMONT1, freq_desc_byt),
+ INTEL_CPU_FAM6(ATOM_SALTWELL_MID, freq_desc_pnw),
+ INTEL_CPU_FAM6(ATOM_SALTWELL_TABLET, freq_desc_clv),
+ INTEL_CPU_FAM6(ATOM_SILVERMONT, freq_desc_byt),
+ INTEL_CPU_FAM6(ATOM_SILVERMONT_MID, freq_desc_tng),
INTEL_CPU_FAM6(ATOM_AIRMONT, freq_desc_cht),
- INTEL_CPU_FAM6(ATOM_MERRIFIELD, freq_desc_tng),
- INTEL_CPU_FAM6(ATOM_MOOREFIELD, freq_desc_ann),
+ INTEL_CPU_FAM6(ATOM_AIRMONT_MID, freq_desc_ann),
{}
};
diff --git a/arch/x86/platform/atom/punit_atom_debug.c b/arch/x86/platform/atom/punit_atom_debug.c
index 034813d4ab1e..41dae0f0d898 100644
--- a/arch/x86/platform/atom/punit_atom_debug.c
+++ b/arch/x86/platform/atom/punit_atom_debug.c
@@ -143,8 +143,8 @@ static void punit_dbgfs_unregister(void)
(kernel_ulong_t)&drv_data }
static const struct x86_cpu_id intel_punit_cpu_ids[] = {
- ICPU(INTEL_FAM6_ATOM_SILVERMONT1, punit_device_byt),
- ICPU(INTEL_FAM6_ATOM_MERRIFIELD, punit_device_tng),
+ ICPU(INTEL_FAM6_ATOM_SILVERMONT, punit_device_byt),
+ ICPU(INTEL_FAM6_ATOM_SILVERMONT_MID, punit_device_tng),
ICPU(INTEL_FAM6_ATOM_AIRMONT, punit_device_cht),
{}
};
diff --git a/arch/x86/platform/intel-mid/device_libs/platform_bt.c b/arch/x86/platform/intel-mid/device_libs/platform_bt.c
index 5a0483e7bf66..31dce781364c 100644
--- a/arch/x86/platform/intel-mid/device_libs/platform_bt.c
+++ b/arch/x86/platform/intel-mid/device_libs/platform_bt.c
@@ -68,7 +68,7 @@ static struct bt_sfi_data tng_bt_sfi_data __initdata = {
{ X86_VENDOR_INTEL, 6, model, X86_FEATURE_ANY, (kernel_ulong_t)&ddata }
static const struct x86_cpu_id bt_sfi_cpu_ids[] = {
- ICPU(INTEL_FAM6_ATOM_MERRIFIELD, tng_bt_sfi_data),
+ ICPU(INTEL_FAM6_ATOM_SILVERMONT_MID, tng_bt_sfi_data),
{}
};
diff --git a/drivers/acpi/acpi_lpss.c b/drivers/acpi/acpi_lpss.c
index 969bf8d515c0..c651e206d796 100644
--- a/drivers/acpi/acpi_lpss.c
+++ b/drivers/acpi/acpi_lpss.c
@@ -292,7 +292,7 @@ static const struct lpss_device_desc bsw_spi_dev_desc = {
#define ICPU(model) { X86_VENDOR_INTEL, 6, model, X86_FEATURE_ANY, }
static const struct x86_cpu_id lpss_cpu_ids[] = {
- ICPU(INTEL_FAM6_ATOM_SILVERMONT1), /* Valleyview, Bay Trail */
+ ICPU(INTEL_FAM6_ATOM_SILVERMONT), /* Valleyview, Bay Trail */
ICPU(INTEL_FAM6_ATOM_AIRMONT), /* Braswell, Cherry Trail */
{}
};
diff --git a/drivers/acpi/x86/utils.c b/drivers/acpi/x86/utils.c
index 06c31ec3cc70..9a8e286dd86f 100644
--- a/drivers/acpi/x86/utils.c
+++ b/drivers/acpi/x86/utils.c
@@ -54,7 +54,7 @@ static const struct always_present_id always_present_ids[] = {
* Bay / Cherry Trail PWM directly poked by GPU driver in win10,
* but Linux uses a separate PWM driver, harmless if not used.
*/
- ENTRY("80860F09", "1", ICPU(INTEL_FAM6_ATOM_SILVERMONT1), {}),
+ ENTRY("80860F09", "1", ICPU(INTEL_FAM6_ATOM_SILVERMONT), {}),
ENTRY("80862288", "1", ICPU(INTEL_FAM6_ATOM_AIRMONT), {}),
/*
* The INT0002 device is necessary to clear wakeup interrupt sources
diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
index a005711f909e..29f25d5d65e0 100644
--- a/drivers/cpufreq/intel_pstate.c
+++ b/drivers/cpufreq/intel_pstate.c
@@ -1779,7 +1779,7 @@ static const struct pstate_funcs knl_funcs = {
static const struct x86_cpu_id intel_pstate_cpu_ids[] = {
ICPU(INTEL_FAM6_SANDYBRIDGE, core_funcs),
ICPU(INTEL_FAM6_SANDYBRIDGE_X, core_funcs),
- ICPU(INTEL_FAM6_ATOM_SILVERMONT1, silvermont_funcs),
+ ICPU(INTEL_FAM6_ATOM_SILVERMONT, silvermont_funcs),
ICPU(INTEL_FAM6_IVYBRIDGE, core_funcs),
ICPU(INTEL_FAM6_HASWELL_CORE, core_funcs),
ICPU(INTEL_FAM6_BROADWELL_CORE, core_funcs),
@@ -1796,7 +1796,7 @@ static const struct x86_cpu_id intel_pstate_cpu_ids[] = {
ICPU(INTEL_FAM6_XEON_PHI_KNL, knl_funcs),
ICPU(INTEL_FAM6_XEON_PHI_KNM, knl_funcs),
ICPU(INTEL_FAM6_ATOM_GOLDMONT, core_funcs),
- ICPU(INTEL_FAM6_ATOM_GEMINI_LAKE, core_funcs),
+ ICPU(INTEL_FAM6_ATOM_GOLDMONT_PLUS, core_funcs),
ICPU(INTEL_FAM6_SKYLAKE_X, core_funcs),
{}
};
diff --git a/drivers/edac/pnd2_edac.c b/drivers/edac/pnd2_edac.c
index df28b65358d2..903a4f1fadcc 100644
--- a/drivers/edac/pnd2_edac.c
+++ b/drivers/edac/pnd2_edac.c
@@ -1541,7 +1541,7 @@ static struct dunit_ops dnv_ops = {
static const struct x86_cpu_id pnd2_cpuids[] = {
{ X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_GOLDMONT, 0, (kernel_ulong_t)&apl_ops },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_DENVERTON, 0, (kernel_ulong_t)&dnv_ops },
+ { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_GOLDMONT_X, 0, (kernel_ulong_t)&dnv_ops },
{ }
};
MODULE_DEVICE_TABLE(x86cpu, pnd2_cpuids);
diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
index b2ccce5fb071..c4bb67ed8da3 100644
--- a/drivers/idle/intel_idle.c
+++ b/drivers/idle/intel_idle.c
@@ -1076,14 +1076,14 @@ static const struct x86_cpu_id intel_idle_ids[] __initconst = {
ICPU(INTEL_FAM6_WESTMERE, idle_cpu_nehalem),
ICPU(INTEL_FAM6_WESTMERE_EP, idle_cpu_nehalem),
ICPU(INTEL_FAM6_NEHALEM_EX, idle_cpu_nehalem),
- ICPU(INTEL_FAM6_ATOM_PINEVIEW, idle_cpu_atom),
- ICPU(INTEL_FAM6_ATOM_LINCROFT, idle_cpu_lincroft),
+ ICPU(INTEL_FAM6_ATOM_BONNELL, idle_cpu_atom),
+ ICPU(INTEL_FAM6_ATOM_BONNELL_MID, idle_cpu_lincroft),
ICPU(INTEL_FAM6_WESTMERE_EX, idle_cpu_nehalem),
ICPU(INTEL_FAM6_SANDYBRIDGE, idle_cpu_snb),
ICPU(INTEL_FAM6_SANDYBRIDGE_X, idle_cpu_snb),
- ICPU(INTEL_FAM6_ATOM_CEDARVIEW, idle_cpu_atom),
- ICPU(INTEL_FAM6_ATOM_SILVERMONT1, idle_cpu_byt),
- ICPU(INTEL_FAM6_ATOM_MERRIFIELD, idle_cpu_tangier),
+ ICPU(INTEL_FAM6_ATOM_SALTWELL, idle_cpu_atom),
+ ICPU(INTEL_FAM6_ATOM_SILVERMONT, idle_cpu_byt),
+ ICPU(INTEL_FAM6_ATOM_SILVERMONT_MID, idle_cpu_tangier),
ICPU(INTEL_FAM6_ATOM_AIRMONT, idle_cpu_cht),
ICPU(INTEL_FAM6_IVYBRIDGE, idle_cpu_ivb),
ICPU(INTEL_FAM6_IVYBRIDGE_X, idle_cpu_ivt),
@@ -1091,7 +1091,7 @@ static const struct x86_cpu_id intel_idle_ids[] __initconst = {
ICPU(INTEL_FAM6_HASWELL_X, idle_cpu_hsw),
ICPU(INTEL_FAM6_HASWELL_ULT, idle_cpu_hsw),
ICPU(INTEL_FAM6_HASWELL_GT3E, idle_cpu_hsw),
- ICPU(INTEL_FAM6_ATOM_SILVERMONT2, idle_cpu_avn),
+ ICPU(INTEL_FAM6_ATOM_SILVERMONT_X, idle_cpu_avn),
ICPU(INTEL_FAM6_BROADWELL_CORE, idle_cpu_bdw),
ICPU(INTEL_FAM6_BROADWELL_GT3E, idle_cpu_bdw),
ICPU(INTEL_FAM6_BROADWELL_X, idle_cpu_bdw),
@@ -1104,8 +1104,8 @@ static const struct x86_cpu_id intel_idle_ids[] __initconst = {
ICPU(INTEL_FAM6_XEON_PHI_KNL, idle_cpu_knl),
ICPU(INTEL_FAM6_XEON_PHI_KNM, idle_cpu_knl),
ICPU(INTEL_FAM6_ATOM_GOLDMONT, idle_cpu_bxt),
- ICPU(INTEL_FAM6_ATOM_GEMINI_LAKE, idle_cpu_bxt),
- ICPU(INTEL_FAM6_ATOM_DENVERTON, idle_cpu_dnv),
+ ICPU(INTEL_FAM6_ATOM_GOLDMONT_PLUS, idle_cpu_bxt),
+ ICPU(INTEL_FAM6_ATOM_GOLDMONT_X, idle_cpu_dnv),
{}
};
@@ -1322,7 +1322,7 @@ static void intel_idle_state_table_update(void)
ivt_idle_state_table_update();
break;
case INTEL_FAM6_ATOM_GOLDMONT:
- case INTEL_FAM6_ATOM_GEMINI_LAKE:
+ case INTEL_FAM6_ATOM_GOLDMONT_PLUS:
bxt_idle_state_table_update();
break;
case INTEL_FAM6_SKYLAKE_DESKTOP:
diff --git a/drivers/mmc/host/sdhci-acpi.c b/drivers/mmc/host/sdhci-acpi.c
index c61109f7b793..57c1ec322e42 100644
--- a/drivers/mmc/host/sdhci-acpi.c
+++ b/drivers/mmc/host/sdhci-acpi.c
@@ -247,7 +247,7 @@ static const struct sdhci_acpi_chip sdhci_acpi_chip_int = {
static bool sdhci_acpi_byt(void)
{
static const struct x86_cpu_id byt[] = {
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_SILVERMONT1 },
+ { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_SILVERMONT },
{}
};
diff --git a/drivers/pci/pci-mid.c b/drivers/pci/pci-mid.c
index 314e135014dc..30fbe2ea6eab 100644
--- a/drivers/pci/pci-mid.c
+++ b/drivers/pci/pci-mid.c
@@ -62,8 +62,8 @@ static const struct pci_platform_pm_ops mid_pci_platform_pm = {
* arch/x86/platform/intel-mid/pwr.c.
*/
static const struct x86_cpu_id lpss_cpu_ids[] = {
- ICPU(INTEL_FAM6_ATOM_PENWELL),
- ICPU(INTEL_FAM6_ATOM_MERRIFIELD),
+ ICPU(INTEL_FAM6_ATOM_SALTWELL_MID),
+ ICPU(INTEL_FAM6_ATOM_SILVERMONT_MID),
{}
};
diff --git a/drivers/platform/x86/intel_int0002_vgpio.c b/drivers/platform/x86/intel_int0002_vgpio.c
index a473dc51b18d..e89ad4964dc1 100644
--- a/drivers/platform/x86/intel_int0002_vgpio.c
+++ b/drivers/platform/x86/intel_int0002_vgpio.c
@@ -60,7 +60,7 @@ static const struct x86_cpu_id int0002_cpu_ids[] = {
/*
* Limit ourselves to Cherry Trail for now, until testing shows we
* need to handle the INT0002 device on Baytrail too.
- * ICPU(INTEL_FAM6_ATOM_SILVERMONT1), * Valleyview, Bay Trail *
+ * ICPU(INTEL_FAM6_ATOM_SILVERMONT), * Valleyview, Bay Trail *
*/
ICPU(INTEL_FAM6_ATOM_AIRMONT), /* Braswell, Cherry Trail */
{}
diff --git a/drivers/platform/x86/intel_mid_powerbtn.c b/drivers/platform/x86/intel_mid_powerbtn.c
index d79fbf924b13..5ad44204a9c3 100644
--- a/drivers/platform/x86/intel_mid_powerbtn.c
+++ b/drivers/platform/x86/intel_mid_powerbtn.c
@@ -125,8 +125,8 @@ static const struct mid_pb_ddata mrfld_ddata = {
{ X86_VENDOR_INTEL, 6, model, X86_FEATURE_ANY, (kernel_ulong_t)&ddata }
static const struct x86_cpu_id mid_pb_cpu_ids[] = {
- ICPU(INTEL_FAM6_ATOM_PENWELL, mfld_ddata),
- ICPU(INTEL_FAM6_ATOM_MERRIFIELD, mrfld_ddata),
+ ICPU(INTEL_FAM6_ATOM_SALTWELL_MID, mfld_ddata),
+ ICPU(INTEL_FAM6_ATOM_SILVERMONT_MID, mrfld_ddata),
{}
};
diff --git a/drivers/platform/x86/intel_telemetry_debugfs.c b/drivers/platform/x86/intel_telemetry_debugfs.c
index 1423fa8710fd..b998d7da97fb 100644
--- a/drivers/platform/x86/intel_telemetry_debugfs.c
+++ b/drivers/platform/x86/intel_telemetry_debugfs.c
@@ -320,7 +320,7 @@ static struct telemetry_debugfs_conf telem_apl_debugfs_conf = {
static const struct x86_cpu_id telemetry_debugfs_cpu_ids[] = {
TELEM_DEBUGFS_CPU(INTEL_FAM6_ATOM_GOLDMONT, telem_apl_debugfs_conf),
- TELEM_DEBUGFS_CPU(INTEL_FAM6_ATOM_GEMINI_LAKE, telem_apl_debugfs_conf),
+ TELEM_DEBUGFS_CPU(INTEL_FAM6_ATOM_GOLDMONT_PLUS, telem_apl_debugfs_conf),
{}
};
diff --git a/drivers/platform/x86/intel_telemetry_pltdrv.c b/drivers/platform/x86/intel_telemetry_pltdrv.c
index 2f889d6c270e..fcc6bee51a42 100644
--- a/drivers/platform/x86/intel_telemetry_pltdrv.c
+++ b/drivers/platform/x86/intel_telemetry_pltdrv.c
@@ -192,7 +192,7 @@ static struct telemetry_plt_config telem_glk_config = {
static const struct x86_cpu_id telemetry_cpu_ids[] = {
TELEM_CPU(INTEL_FAM6_ATOM_GOLDMONT, telem_apl_config),
- TELEM_CPU(INTEL_FAM6_ATOM_GEMINI_LAKE, telem_glk_config),
+ TELEM_CPU(INTEL_FAM6_ATOM_GOLDMONT_PLUS, telem_glk_config),
{}
};
diff --git a/drivers/powercap/intel_rapl.c b/drivers/powercap/intel_rapl.c
index 295d8dcba48c..8cbfcce57a06 100644
--- a/drivers/powercap/intel_rapl.c
+++ b/drivers/powercap/intel_rapl.c
@@ -1164,13 +1164,13 @@ static const struct x86_cpu_id rapl_ids[] __initconst = {
RAPL_CPU(INTEL_FAM6_KABYLAKE_DESKTOP, rapl_defaults_core),
RAPL_CPU(INTEL_FAM6_CANNONLAKE_MOBILE, rapl_defaults_core),
- RAPL_CPU(INTEL_FAM6_ATOM_SILVERMONT1, rapl_defaults_byt),
+ RAPL_CPU(INTEL_FAM6_ATOM_SILVERMONT, rapl_defaults_byt),
RAPL_CPU(INTEL_FAM6_ATOM_AIRMONT, rapl_defaults_cht),
- RAPL_CPU(INTEL_FAM6_ATOM_MERRIFIELD, rapl_defaults_tng),
- RAPL_CPU(INTEL_FAM6_ATOM_MOOREFIELD, rapl_defaults_ann),
+ RAPL_CPU(INTEL_FAM6_ATOM_SILVERMONT_MID, rapl_defaults_tng),
+ RAPL_CPU(INTEL_FAM6_ATOM_AIRMONT_MID, rapl_defaults_ann),
RAPL_CPU(INTEL_FAM6_ATOM_GOLDMONT, rapl_defaults_core),
- RAPL_CPU(INTEL_FAM6_ATOM_GEMINI_LAKE, rapl_defaults_core),
- RAPL_CPU(INTEL_FAM6_ATOM_DENVERTON, rapl_defaults_core),
+ RAPL_CPU(INTEL_FAM6_ATOM_GOLDMONT_PLUS, rapl_defaults_core),
+ RAPL_CPU(INTEL_FAM6_ATOM_GOLDMONT_X, rapl_defaults_core),
RAPL_CPU(INTEL_FAM6_XEON_PHI_KNL, rapl_defaults_hsw_server),
RAPL_CPU(INTEL_FAM6_XEON_PHI_KNM, rapl_defaults_hsw_server),
diff --git a/drivers/thermal/intel_soc_dts_thermal.c b/drivers/thermal/intel_soc_dts_thermal.c
index 1e47511a6bd5..d748527d7a38 100644
--- a/drivers/thermal/intel_soc_dts_thermal.c
+++ b/drivers/thermal/intel_soc_dts_thermal.c
@@ -45,7 +45,7 @@ static irqreturn_t soc_irq_thread_fn(int irq, void *dev_data)
}
static const struct x86_cpu_id soc_thermal_ids[] = {
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_SILVERMONT1, 0,
+ { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_SILVERMONT, 0,
BYT_SOC_DTS_APIC_IRQ},
{}
};
diff --git a/sound/soc/intel/boards/bytcr_rt5651.c b/sound/soc/intel/boards/bytcr_rt5651.c
index b74bbee111c6..c6c8d20be1d2 100644
--- a/sound/soc/intel/boards/bytcr_rt5651.c
+++ b/sound/soc/intel/boards/bytcr_rt5651.c
@@ -787,7 +787,7 @@ static struct snd_soc_card byt_rt5651_card = {
};
static const struct x86_cpu_id baytrail_cpu_ids[] = {
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_SILVERMONT1 }, /* Valleyview */
+ { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_SILVERMONT }, /* Valleyview */
{}
};
diff --git a/tools/power/x86/turbostat/turbostat.c b/tools/power/x86/turbostat/turbostat.c
index 83964f796edb..fbb53c952b73 100644
--- a/tools/power/x86/turbostat/turbostat.c
+++ b/tools/power/x86/turbostat/turbostat.c
@@ -2082,7 +2082,7 @@ int has_turbo_ratio_group_limits(int family, int model)
switch (model) {
case INTEL_FAM6_ATOM_GOLDMONT:
case INTEL_FAM6_SKYLAKE_X:
- case INTEL_FAM6_ATOM_DENVERTON:
+ case INTEL_FAM6_ATOM_GOLDMONT_X:
return 1;
}
return 0;
@@ -3149,9 +3149,9 @@ int probe_nhm_msrs(unsigned int family, unsigned int model)
pkg_cstate_limits = skx_pkg_cstate_limits;
has_misc_feature_control = 1;
break;
- case INTEL_FAM6_ATOM_SILVERMONT1: /* BYT */
+ case INTEL_FAM6_ATOM_SILVERMONT: /* BYT */
no_MSR_MISC_PWR_MGMT = 1;
- case INTEL_FAM6_ATOM_SILVERMONT2: /* AVN */
+ case INTEL_FAM6_ATOM_SILVERMONT_X: /* AVN */
pkg_cstate_limits = slv_pkg_cstate_limits;
break;
case INTEL_FAM6_ATOM_AIRMONT: /* AMT */
@@ -3163,8 +3163,8 @@ int probe_nhm_msrs(unsigned int family, unsigned int model)
pkg_cstate_limits = phi_pkg_cstate_limits;
break;
case INTEL_FAM6_ATOM_GOLDMONT: /* BXT */
- case INTEL_FAM6_ATOM_GEMINI_LAKE:
- case INTEL_FAM6_ATOM_DENVERTON: /* DNV */
+ case INTEL_FAM6_ATOM_GOLDMONT_PLUS:
+ case INTEL_FAM6_ATOM_GOLDMONT_X: /* DNV */
pkg_cstate_limits = bxt_pkg_cstate_limits;
break;
default:
@@ -3193,9 +3193,9 @@ int has_slv_msrs(unsigned int family, unsigned int model)
return 0;
switch (model) {
- case INTEL_FAM6_ATOM_SILVERMONT1:
- case INTEL_FAM6_ATOM_MERRIFIELD:
- case INTEL_FAM6_ATOM_MOOREFIELD:
+ case INTEL_FAM6_ATOM_SILVERMONT:
+ case INTEL_FAM6_ATOM_SILVERMONT_MID:
+ case INTEL_FAM6_ATOM_AIRMONT_MID:
return 1;
}
return 0;
@@ -3207,7 +3207,7 @@ int is_dnv(unsigned int family, unsigned int model)
return 0;
switch (model) {
- case INTEL_FAM6_ATOM_DENVERTON:
+ case INTEL_FAM6_ATOM_GOLDMONT_X:
return 1;
}
return 0;
@@ -3724,8 +3724,8 @@ double get_tdp(unsigned int model)
return ((msr >> 0) & RAPL_POWER_GRANULARITY) * rapl_power_units;
switch (model) {
- case INTEL_FAM6_ATOM_SILVERMONT1:
- case INTEL_FAM6_ATOM_SILVERMONT2:
+ case INTEL_FAM6_ATOM_SILVERMONT:
+ case INTEL_FAM6_ATOM_SILVERMONT_X:
return 30.0;
default:
return 135.0;
@@ -3791,7 +3791,7 @@ void rapl_probe(unsigned int family, unsigned int model)
}
break;
case INTEL_FAM6_ATOM_GOLDMONT: /* BXT */
- case INTEL_FAM6_ATOM_GEMINI_LAKE:
+ case INTEL_FAM6_ATOM_GOLDMONT_PLUS:
do_rapl = RAPL_PKG | RAPL_PKG_POWER_INFO;
if (rapl_joules)
BIC_PRESENT(BIC_Pkg_J);
@@ -3850,8 +3850,8 @@ void rapl_probe(unsigned int family, unsigned int model)
BIC_PRESENT(BIC_RAMWatt);
}
break;
- case INTEL_FAM6_ATOM_SILVERMONT1: /* BYT */
- case INTEL_FAM6_ATOM_SILVERMONT2: /* AVN */
+ case INTEL_FAM6_ATOM_SILVERMONT: /* BYT */
+ case INTEL_FAM6_ATOM_SILVERMONT_X: /* AVN */
do_rapl = RAPL_PKG | RAPL_CORES;
if (rapl_joules) {
BIC_PRESENT(BIC_Pkg_J);
@@ -3861,7 +3861,7 @@ void rapl_probe(unsigned int family, unsigned int model)
BIC_PRESENT(BIC_CorWatt);
}
break;
- case INTEL_FAM6_ATOM_DENVERTON: /* DNV */
+ case INTEL_FAM6_ATOM_GOLDMONT_X: /* DNV */
do_rapl = RAPL_PKG | RAPL_DRAM | RAPL_DRAM_POWER_INFO | RAPL_DRAM_PERF_STATUS | RAPL_PKG_PERF_STATUS | RAPL_PKG_POWER_INFO | RAPL_CORES_ENERGY_STATUS;
BIC_PRESENT(BIC_PKG__);
BIC_PRESENT(BIC_RAM__);
@@ -3884,7 +3884,7 @@ void rapl_probe(unsigned int family, unsigned int model)
return;
rapl_power_units = 1.0 / (1 << (msr & 0xF));
- if (model == INTEL_FAM6_ATOM_SILVERMONT1)
+ if (model == INTEL_FAM6_ATOM_SILVERMONT)
rapl_energy_units = 1.0 * (1 << (msr >> 8 & 0x1F)) / 1000000;
else
rapl_energy_units = 1.0 / (1 << (msr >> 8 & 0x1F));
@@ -4141,8 +4141,8 @@ int has_snb_msrs(unsigned int family, unsigned int model)
case INTEL_FAM6_CANNONLAKE_MOBILE: /* CNL */
case INTEL_FAM6_SKYLAKE_X: /* SKX */
case INTEL_FAM6_ATOM_GOLDMONT: /* BXT */
- case INTEL_FAM6_ATOM_GEMINI_LAKE:
- case INTEL_FAM6_ATOM_DENVERTON: /* DNV */
+ case INTEL_FAM6_ATOM_GOLDMONT_PLUS:
+ case INTEL_FAM6_ATOM_GOLDMONT_X: /* DNV */
return 1;
}
return 0;
@@ -4174,7 +4174,7 @@ int has_hsw_msrs(unsigned int family, unsigned int model)
case INTEL_FAM6_KABYLAKE_DESKTOP: /* KBL */
case INTEL_FAM6_CANNONLAKE_MOBILE: /* CNL */
case INTEL_FAM6_ATOM_GOLDMONT: /* BXT */
- case INTEL_FAM6_ATOM_GEMINI_LAKE:
+ case INTEL_FAM6_ATOM_GOLDMONT_PLUS:
return 1;
}
return 0;
@@ -4209,8 +4209,8 @@ int is_slm(unsigned int family, unsigned int model)
if (!genuine_intel)
return 0;
switch (model) {
- case INTEL_FAM6_ATOM_SILVERMONT1: /* BYT */
- case INTEL_FAM6_ATOM_SILVERMONT2: /* AVN */
+ case INTEL_FAM6_ATOM_SILVERMONT: /* BYT */
+ case INTEL_FAM6_ATOM_SILVERMONT_X: /* AVN */
return 1;
}
return 0;
@@ -4581,11 +4581,11 @@ void process_cpuid()
case INTEL_FAM6_KABYLAKE_DESKTOP: /* KBL */
crystal_hz = 24000000; /* 24.0 MHz */
break;
- case INTEL_FAM6_ATOM_DENVERTON: /* DNV */
+ case INTEL_FAM6_ATOM_GOLDMONT_X: /* DNV */
crystal_hz = 25000000; /* 25.0 MHz */
break;
case INTEL_FAM6_ATOM_GOLDMONT: /* BXT */
- case INTEL_FAM6_ATOM_GEMINI_LAKE:
+ case INTEL_FAM6_ATOM_GOLDMONT_PLUS:
crystal_hz = 19200000; /* 19.2 MHz */
break;
default:


@ -1,48 +0,0 @@
From e75df51aa9a06da683ae47809b52e1987d2824f8 Mon Sep 17 00:00:00 2001
From: Eduardo Habkost <ehabkost@redhat.com>
Date: Wed, 5 Dec 2018 17:19:56 -0200
Subject: [PATCH 03/30] kvm: x86: Report STIBP on GET_SUPPORTED_CPUID
commit d7b09c827a6cf291f66637a36f46928dd1423184 upstream
Months ago, we added code to allow the guest direct access to
MSR_IA32_SPEC_CTRL, which makes STIBP available to guests. This was implemented
by commits d28b387fb74d ("KVM/VMX: Allow direct access to
MSR_IA32_SPEC_CTRL") and b2ac58f90540 ("KVM/SVM: Allow direct access to
MSR_IA32_SPEC_CTRL").
However, we never updated GET_SUPPORTED_CPUID to let userspace know that
STIBP can be enabled in CPUID. Fix that by updating
kvm_cpuid_8000_0008_ebx_x86_features and kvm_cpuid_7_0_edx_x86_features.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
arch/x86/kvm/cpuid.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 98d13c6a64be..fe9907517fb4 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -382,7 +382,7 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
/* cpuid 0x80000008.ebx */
const u32 kvm_cpuid_8000_0008_ebx_x86_features =
F(AMD_IBPB) | F(AMD_IBRS) | F(AMD_SSBD) | F(VIRT_SSBD) |
- F(AMD_SSB_NO);
+ F(AMD_SSB_NO) | F(AMD_STIBP);
/* cpuid 0xC0000001.edx */
const u32 kvm_cpuid_C000_0001_edx_x86_features =
@@ -412,7 +412,7 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
/* cpuid 7.0.edx*/
const u32 kvm_cpuid_7_0_edx_x86_features =
F(AVX512_4VNNIW) | F(AVX512_4FMAPS) | F(SPEC_CTRL) |
- F(SPEC_CTRL_SSBD) | F(ARCH_CAPABILITIES);
+ F(SPEC_CTRL_SSBD) | F(ARCH_CAPABILITIES) | F(INTEL_STIBP);
/* all calls to cpuid_count() should be made on the same cpu */
get_cpu();
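
Once the VMM actually exposes the bit, a guest sees STIBP in CPUID leaf 7. A minimal user-space sketch of probing it, assuming GCC's cpuid.h; the bit position (EDX bit 27 of CPUID.(EAX=7,ECX=0)) matches the kernel's X86_FEATURE_INTEL_STIBP definition, word 18 bit 27:

#include <cpuid.h>
#include <stdio.h>

int main(void)
{
	unsigned int eax, ebx, ecx, edx;

	/* CPUID.(EAX=7,ECX=0):EDX[27] = Single Thread Indirect
	 * Branch Predictors (STIBP) on Intel parts. */
	if (__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
		printf("STIBP %s\n",
		       (edx & (1u << 27)) ? "enumerated" : "not enumerated");
	return 0;
}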


@ -1,119 +0,0 @@
From de787f2bb763b7b5516f5c43df2e9c63be6ef279 Mon Sep 17 00:00:00 2001
From: Thomas Gleixner <tglx@linutronix.de>
Date: Thu, 21 Feb 2019 12:36:50 +0100
Subject: [PATCH 04/30] x86/msr-index: Cleanup bit defines
commit d8eabc37310a92df40d07c5a8afc53cebf996716 upstream
Greg pointed out that the speculation-related bit defines are using the
(1 << N) format instead of BIT(N). Aside from that, (1 << N) is wrong, as it
should use at least 1UL.
Clean it up.
[ Josh Poimboeuf: Fix tools build ]
Reported-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Reviewed-by: Jon Masters <jcm@redhat.com>
Tested-by: Jon Masters <jcm@redhat.com>
---
arch/x86/include/asm/msr-index.h | 34 ++++++++++---------
tools/power/x86/turbostat/Makefile | 2 +-
.../power/x86/x86_energy_perf_policy/Makefile | 2 +-
3 files changed, 20 insertions(+), 18 deletions(-)
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index f14ca0be1e3f..308b7c94df00 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -2,6 +2,8 @@
#ifndef _ASM_X86_MSR_INDEX_H
#define _ASM_X86_MSR_INDEX_H
+#include <linux/bits.h>
+
/*
* CPU model specific register (MSR) numbers.
*
@@ -40,14 +42,14 @@
/* Intel MSRs. Some also available on other CPUs */
#define MSR_IA32_SPEC_CTRL 0x00000048 /* Speculation Control */
-#define SPEC_CTRL_IBRS (1 << 0) /* Indirect Branch Restricted Speculation */
+#define SPEC_CTRL_IBRS BIT(0) /* Indirect Branch Restricted Speculation */
#define SPEC_CTRL_STIBP_SHIFT 1 /* Single Thread Indirect Branch Predictor (STIBP) bit */
-#define SPEC_CTRL_STIBP (1 << SPEC_CTRL_STIBP_SHIFT) /* STIBP mask */
+#define SPEC_CTRL_STIBP BIT(SPEC_CTRL_STIBP_SHIFT) /* STIBP mask */
#define SPEC_CTRL_SSBD_SHIFT 2 /* Speculative Store Bypass Disable bit */
-#define SPEC_CTRL_SSBD (1 << SPEC_CTRL_SSBD_SHIFT) /* Speculative Store Bypass Disable */
+#define SPEC_CTRL_SSBD BIT(SPEC_CTRL_SSBD_SHIFT) /* Speculative Store Bypass Disable */
#define MSR_IA32_PRED_CMD 0x00000049 /* Prediction Command */
-#define PRED_CMD_IBPB (1 << 0) /* Indirect Branch Prediction Barrier */
+#define PRED_CMD_IBPB BIT(0) /* Indirect Branch Prediction Barrier */
#define MSR_PPIN_CTL 0x0000004e
#define MSR_PPIN 0x0000004f
@@ -69,20 +71,20 @@
#define MSR_MTRRcap 0x000000fe
#define MSR_IA32_ARCH_CAPABILITIES 0x0000010a
-#define ARCH_CAP_RDCL_NO (1 << 0) /* Not susceptible to Meltdown */
-#define ARCH_CAP_IBRS_ALL (1 << 1) /* Enhanced IBRS support */
-#define ARCH_CAP_SKIP_VMENTRY_L1DFLUSH (1 << 3) /* Skip L1D flush on vmentry */
-#define ARCH_CAP_SSB_NO (1 << 4) /*
- * Not susceptible to Speculative Store Bypass
- * attack, so no Speculative Store Bypass
- * control required.
- */
+#define ARCH_CAP_RDCL_NO BIT(0) /* Not susceptible to Meltdown */
+#define ARCH_CAP_IBRS_ALL BIT(1) /* Enhanced IBRS support */
+#define ARCH_CAP_SKIP_VMENTRY_L1DFLUSH BIT(3) /* Skip L1D flush on vmentry */
+#define ARCH_CAP_SSB_NO BIT(4) /*
+ * Not susceptible to Speculative Store Bypass
+ * attack, so no Speculative Store Bypass
+ * control required.
+ */
#define MSR_IA32_FLUSH_CMD 0x0000010b
-#define L1D_FLUSH (1 << 0) /*
- * Writeback and invalidate the
- * L1 data cache.
- */
+#define L1D_FLUSH BIT(0) /*
+ * Writeback and invalidate the
+ * L1 data cache.
+ */
#define MSR_IA32_BBL_CR_CTL 0x00000119
#define MSR_IA32_BBL_CR_CTL3 0x0000011e
diff --git a/tools/power/x86/turbostat/Makefile b/tools/power/x86/turbostat/Makefile
index 2ab25aa38263..ff058bfbca3e 100644
--- a/tools/power/x86/turbostat/Makefile
+++ b/tools/power/x86/turbostat/Makefile
@@ -9,7 +9,7 @@ ifeq ("$(origin O)", "command line")
endif
turbostat : turbostat.c
-CFLAGS += -Wall
+CFLAGS += -Wall -I../../../include
CFLAGS += -DMSRHEADER='"../../../../arch/x86/include/asm/msr-index.h"'
CFLAGS += -DINTEL_FAMILY_HEADER='"../../../../arch/x86/include/asm/intel-family.h"'
diff --git a/tools/power/x86/x86_energy_perf_policy/Makefile b/tools/power/x86/x86_energy_perf_policy/Makefile
index f4534fb8b951..da781b430937 100644
--- a/tools/power/x86/x86_energy_perf_policy/Makefile
+++ b/tools/power/x86/x86_energy_perf_policy/Makefile
@@ -9,7 +9,7 @@ ifeq ("$(origin O)", "command line")
endif
x86_energy_perf_policy : x86_energy_perf_policy.c
-CFLAGS += -Wall
+CFLAGS += -Wall -I../../../include
CFLAGS += -DMSRHEADER='"../../../../arch/x86/include/asm/msr-index.h"'
%: %.c
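
A stand-alone user-space illustration of the 1UL point, assuming the usual 32-bit int and 64-bit long: shifting the signed constant 1 into bit 31 is undefined behaviour in C and in practice sign-extends when widened, while the BIT(n)-style 1UL shift stays well defined.

#include <stdio.h>

#define MY_BIT(n)	(1UL << (n))	/* same shape as the kernel's BIT() */

int main(void)
{
	unsigned long good = MY_BIT(31);	/* 0x80000000 */
	/* Undefined behaviour: 1 << 31 is not representable in a
	 * 32-bit int; typical compilers sign-extend the value to
	 * 0xffffffff80000000 when it is widened to unsigned long. */
	unsigned long bad = 1 << 31;

	printf("good=%#lx bad=%#lx\n", good, bad);
	return 0;
}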


@ -1,169 +0,0 @@
From bb9d4a24ba55d1487a34d287c6b940ce00b85822 Mon Sep 17 00:00:00 2001
From: Thomas Gleixner <tglx@linutronix.de>
Date: Wed, 27 Feb 2019 10:10:23 +0100
Subject: [PATCH 05/30] x86/speculation: Consolidate CPU whitelists
commit 36ad35131adacc29b328b9c8b6277a8bf0d6fd5d upstream
The CPU vulnerability whitelists have some overlap and there are more
whitelists coming along.
Use the driver_data field in the x86_cpu_id struct to denote the
whitelisted vulnerabilities and combine all whitelists into one.
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Jon Masters <jcm@redhat.com>
Tested-by: Jon Masters <jcm@redhat.com>
---
arch/x86/kernel/cpu/common.c | 105 +++++++++++++++++++----------------
1 file changed, 56 insertions(+), 49 deletions(-)
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 10e5ccfa9278..fd16b4cc991f 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -948,60 +948,68 @@ static void identify_cpu_without_cpuid(struct cpuinfo_x86 *c)
#endif
}
-static const __initconst struct x86_cpu_id cpu_no_speculation[] = {
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_SALTWELL, X86_FEATURE_ANY },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_SALTWELL_TABLET, X86_FEATURE_ANY },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_BONNELL_MID, X86_FEATURE_ANY },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_SALTWELL_MID, X86_FEATURE_ANY },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_BONNELL, X86_FEATURE_ANY },
- { X86_VENDOR_CENTAUR, 5 },
- { X86_VENDOR_INTEL, 5 },
- { X86_VENDOR_NSC, 5 },
- { X86_VENDOR_ANY, 4 },
+#define NO_SPECULATION BIT(0)
+#define NO_MELTDOWN BIT(1)
+#define NO_SSB BIT(2)
+#define NO_L1TF BIT(3)
+
+#define VULNWL(_vendor, _family, _model, _whitelist) \
+ { X86_VENDOR_##_vendor, _family, _model, X86_FEATURE_ANY, _whitelist }
+
+#define VULNWL_INTEL(model, whitelist) \
+ VULNWL(INTEL, 6, INTEL_FAM6_##model, whitelist)
+
+#define VULNWL_AMD(family, whitelist) \
+ VULNWL(AMD, family, X86_MODEL_ANY, whitelist)
+
+static const __initconst struct x86_cpu_id cpu_vuln_whitelist[] = {
+ VULNWL(ANY, 4, X86_MODEL_ANY, NO_SPECULATION),
+ VULNWL(CENTAUR, 5, X86_MODEL_ANY, NO_SPECULATION),
+ VULNWL(INTEL, 5, X86_MODEL_ANY, NO_SPECULATION),
+ VULNWL(NSC, 5, X86_MODEL_ANY, NO_SPECULATION),
+
+ VULNWL_INTEL(ATOM_SALTWELL, NO_SPECULATION),
+ VULNWL_INTEL(ATOM_SALTWELL_TABLET, NO_SPECULATION),
+ VULNWL_INTEL(ATOM_SALTWELL_MID, NO_SPECULATION),
+ VULNWL_INTEL(ATOM_BONNELL, NO_SPECULATION),
+ VULNWL_INTEL(ATOM_BONNELL_MID, NO_SPECULATION),
+
+ VULNWL_INTEL(ATOM_SILVERMONT, NO_SSB | NO_L1TF),
+ VULNWL_INTEL(ATOM_SILVERMONT_X, NO_SSB | NO_L1TF),
+ VULNWL_INTEL(ATOM_SILVERMONT_MID, NO_SSB | NO_L1TF),
+ VULNWL_INTEL(ATOM_AIRMONT, NO_SSB | NO_L1TF),
+ VULNWL_INTEL(XEON_PHI_KNL, NO_SSB | NO_L1TF),
+ VULNWL_INTEL(XEON_PHI_KNM, NO_SSB | NO_L1TF),
+
+ VULNWL_INTEL(CORE_YONAH, NO_SSB),
+
+ VULNWL_INTEL(ATOM_AIRMONT_MID, NO_L1TF),
+ VULNWL_INTEL(ATOM_GOLDMONT, NO_L1TF),
+ VULNWL_INTEL(ATOM_GOLDMONT_X, NO_L1TF),
+ VULNWL_INTEL(ATOM_GOLDMONT_PLUS, NO_L1TF),
+
+ VULNWL_AMD(0x0f, NO_MELTDOWN | NO_SSB | NO_L1TF),
+ VULNWL_AMD(0x10, NO_MELTDOWN | NO_SSB | NO_L1TF),
+ VULNWL_AMD(0x11, NO_MELTDOWN | NO_SSB | NO_L1TF),
+ VULNWL_AMD(0x12, NO_MELTDOWN | NO_SSB | NO_L1TF),
+
+ /* FAMILY_ANY must be last, otherwise 0x0f - 0x12 matches won't work */
+ VULNWL_AMD(X86_FAMILY_ANY, NO_MELTDOWN | NO_L1TF),
{}
};
-static const __initconst struct x86_cpu_id cpu_no_meltdown[] = {
- { X86_VENDOR_AMD },
- {}
-};
-
-/* Only list CPUs which speculate but are non susceptible to SSB */
-static const __initconst struct x86_cpu_id cpu_no_spec_store_bypass[] = {
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_SILVERMONT },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_AIRMONT },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_SILVERMONT_X },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_SILVERMONT_MID },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_CORE_YONAH },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_XEON_PHI_KNL },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_XEON_PHI_KNM },
- { X86_VENDOR_AMD, 0x12, },
- { X86_VENDOR_AMD, 0x11, },
- { X86_VENDOR_AMD, 0x10, },
- { X86_VENDOR_AMD, 0xf, },
- {}
-};
+static bool __init cpu_matches(unsigned long which)
+{
+ const struct x86_cpu_id *m = x86_match_cpu(cpu_vuln_whitelist);
-static const __initconst struct x86_cpu_id cpu_no_l1tf[] = {
- /* in addition to cpu_no_speculation */
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_SILVERMONT },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_SILVERMONT_X },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_AIRMONT },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_SILVERMONT_MID },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_AIRMONT_MID },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_GOLDMONT },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_GOLDMONT_X },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_GOLDMONT_PLUS },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_XEON_PHI_KNL },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_XEON_PHI_KNM },
- {}
-};
+ return m && !!(m->driver_data & which);
+}
static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c)
{
u64 ia32_cap = 0;
- if (x86_match_cpu(cpu_no_speculation))
+ if (cpu_matches(NO_SPECULATION))
return;
setup_force_cpu_bug(X86_BUG_SPECTRE_V1);
@@ -1010,15 +1018,14 @@ static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c)
if (cpu_has(c, X86_FEATURE_ARCH_CAPABILITIES))
rdmsrl(MSR_IA32_ARCH_CAPABILITIES, ia32_cap);
- if (!x86_match_cpu(cpu_no_spec_store_bypass) &&
- !(ia32_cap & ARCH_CAP_SSB_NO) &&
+ if (!cpu_matches(NO_SSB) && !(ia32_cap & ARCH_CAP_SSB_NO) &&
!cpu_has(c, X86_FEATURE_AMD_SSB_NO))
setup_force_cpu_bug(X86_BUG_SPEC_STORE_BYPASS);
if (ia32_cap & ARCH_CAP_IBRS_ALL)
setup_force_cpu_cap(X86_FEATURE_IBRS_ENHANCED);
- if (x86_match_cpu(cpu_no_meltdown))
+ if (cpu_matches(NO_MELTDOWN))
return;
/* Rogue Data Cache Load? No! */
@@ -1027,7 +1034,7 @@ static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c)
setup_force_cpu_bug(X86_BUG_CPU_MELTDOWN);
- if (x86_match_cpu(cpu_no_l1tf))
+ if (cpu_matches(NO_L1TF))
return;
setup_force_cpu_bug(X86_BUG_L1TF);
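
The pattern is easy to see in isolation: a single table whose per-entry flags word encodes all of the whitelists, queried through one helper. A self-contained sketch with invented entries; the kernel version matches on struct x86_cpu_id and carries the flags in driver_data:

#include <stdbool.h>
#include <stdio.h>

#define NO_SPECULATION	(1UL << 0)
#define NO_MELTDOWN	(1UL << 1)
#define NO_SSB		(1UL << 2)
#define NO_L1TF		(1UL << 3)

struct vuln_entry {
	int family, model;
	unsigned long flags;	/* plays the role of driver_data */
};

static const struct vuln_entry whitelist[] = {
	{ 6, 0x1c, NO_SPECULATION },	/* e.g. ATOM_BONNELL */
	{ 6, 0x37, NO_SSB | NO_L1TF },	/* e.g. ATOM_SILVERMONT */
	{ 0, 0, 0 }			/* terminator */
};

static bool cpu_matches(int family, int model, unsigned long which)
{
	const struct vuln_entry *e;

	for (e = whitelist; e->family; e++)
		if (e->family == family && e->model == model)
			return (e->flags & which) != 0;
	return false;
}

int main(void)
{
	printf("%d\n", cpu_matches(6, 0x37, NO_L1TF));		/* 1 */
	printf("%d\n", cpu_matches(6, 0x37, NO_MELTDOWN));	/* 0 */
	return 0;
}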


@ -1,154 +0,0 @@
From e082f3653d17755854a3538e5658061ac92e2ab3 Mon Sep 17 00:00:00 2001
From: Andi Kleen <ak@linux.intel.com>
Date: Fri, 18 Jan 2019 16:50:16 -0800
Subject: [PATCH 06/30] x86/speculation/mds: Add basic bug infrastructure for
MDS
commit ed5194c2732c8084af9fd159c146ea92bf137128 upstream
Microarchitectural Data Sampling (MDS) is a class of side channel attacks
on internal buffers in Intel CPUs. The variants are:
- Microarchitectural Store Buffer Data Sampling (MSBDS) (CVE-2018-12126)
- Microarchitectural Fill Buffer Data Sampling (MFBDS) (CVE-2018-12130)
- Microarchitectural Load Port Data Sampling (MLPDS) (CVE-2018-12127)
MSBDS leaks Store Buffer Entries which can be speculatively forwarded to a
dependent load (store-to-load forwarding) as an optimization. The forward
can also happen to a faulting or assisting load operation for a different
memory address, which can be exploited under certain conditions. Store
buffers are partitioned between Hyper-Threads so cross thread forwarding is
not possible. But if a thread enters or exits a sleep state the store
buffer is repartitioned which can expose data from one thread to the other.
MFBDS leaks Fill Buffer Entries. Fill buffers are used internally to manage
L1 miss situations and to hold data which is returned or sent in response
to a memory or I/O operation. Fill buffers can forward data to a load
operation and also write data to the cache. When the fill buffer is
deallocated it can retain the stale data of the preceding operations which
can then be forwarded to a faulting or assisting load operation, which can
be exploited under certain conditions. Fill buffers are shared between
Hyper-Threads so cross thread leakage is possible.
MLPDS leaks Load Port Data. Load ports are used to perform load operations
from memory or I/O. The received data is then forwarded to the register
file or a subsequent operation. In some implementations the Load Port can
contain stale data from a previous operation which can be forwarded to
faulting or assisting loads under certain conditions, which again can be
exploited eventually. Load ports are shared between Hyper-Threads so cross
thread leakage is possible.
All variants have the same mitigation for single CPU thread case (SMT off),
so the kernel can treat them as one MDS issue.
Add the basic infrastructure to detect if the current CPU is affected by
MDS.
[ tglx: Rewrote changelog ]
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Reviewed-by: Jon Masters <jcm@redhat.com>
Tested-by: Jon Masters <jcm@redhat.com>
---
arch/x86/include/asm/cpufeatures.h | 2 ++
arch/x86/include/asm/msr-index.h | 5 +++++
arch/x86/kernel/cpu/common.c | 23 +++++++++++++++--------
3 files changed, 22 insertions(+), 8 deletions(-)
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 7b31ee5223fc..1dc7b8129b55 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -341,6 +341,7 @@
#define X86_FEATURE_AVX512_4VNNIW (18*32+ 2) /* AVX-512 Neural Network Instructions */
#define X86_FEATURE_AVX512_4FMAPS (18*32+ 3) /* AVX-512 Multiply Accumulation Single precision */
#define X86_FEATURE_TSX_FORCE_ABORT (18*32+13) /* "" TSX_FORCE_ABORT */
+#define X86_FEATURE_MD_CLEAR (18*32+10) /* VERW clears CPU buffers */
#define X86_FEATURE_PCONFIG (18*32+18) /* Intel PCONFIG */
#define X86_FEATURE_SPEC_CTRL (18*32+26) /* "" Speculation Control (IBRS + IBPB) */
#define X86_FEATURE_INTEL_STIBP (18*32+27) /* "" Single Thread Indirect Branch Predictors */
@@ -378,5 +379,6 @@
#define X86_BUG_SPECTRE_V2 X86_BUG(16) /* CPU is affected by Spectre variant 2 attack with indirect branches */
#define X86_BUG_SPEC_STORE_BYPASS X86_BUG(17) /* CPU is affected by speculative store bypass attack */
#define X86_BUG_L1TF X86_BUG(18) /* CPU is affected by L1 Terminal Fault */
+#define X86_BUG_MDS X86_BUG(19) /* CPU is affected by Microarchitectural data sampling */
#endif /* _ASM_X86_CPUFEATURES_H */
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 308b7c94df00..f85f43db9225 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -79,6 +79,11 @@
* attack, so no Speculative Store Bypass
* control required.
*/
+#define ARCH_CAP_MDS_NO BIT(5) /*
+ * Not susceptible to
+ * Microarchitectural Data
+ * Sampling (MDS) vulnerabilities.
+ */
#define MSR_IA32_FLUSH_CMD 0x0000010b
#define L1D_FLUSH BIT(0) /*
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index fd16b4cc991f..0ea1e4bc3e20 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -952,6 +952,7 @@ static void identify_cpu_without_cpuid(struct cpuinfo_x86 *c)
#define NO_MELTDOWN BIT(1)
#define NO_SSB BIT(2)
#define NO_L1TF BIT(3)
+#define NO_MDS BIT(4)
#define VULNWL(_vendor, _family, _model, _whitelist) \
{ X86_VENDOR_##_vendor, _family, _model, X86_FEATURE_ANY, _whitelist }
@@ -968,6 +969,7 @@ static const __initconst struct x86_cpu_id cpu_vuln_whitelist[] = {
VULNWL(INTEL, 5, X86_MODEL_ANY, NO_SPECULATION),
VULNWL(NSC, 5, X86_MODEL_ANY, NO_SPECULATION),
+ /* Intel Family 6 */
VULNWL_INTEL(ATOM_SALTWELL, NO_SPECULATION),
VULNWL_INTEL(ATOM_SALTWELL_TABLET, NO_SPECULATION),
VULNWL_INTEL(ATOM_SALTWELL_MID, NO_SPECULATION),
@@ -984,17 +986,19 @@ static const __initconst struct x86_cpu_id cpu_vuln_whitelist[] = {
VULNWL_INTEL(CORE_YONAH, NO_SSB),
VULNWL_INTEL(ATOM_AIRMONT_MID, NO_L1TF),
- VULNWL_INTEL(ATOM_GOLDMONT, NO_L1TF),
- VULNWL_INTEL(ATOM_GOLDMONT_X, NO_L1TF),
- VULNWL_INTEL(ATOM_GOLDMONT_PLUS, NO_L1TF),
- VULNWL_AMD(0x0f, NO_MELTDOWN | NO_SSB | NO_L1TF),
- VULNWL_AMD(0x10, NO_MELTDOWN | NO_SSB | NO_L1TF),
- VULNWL_AMD(0x11, NO_MELTDOWN | NO_SSB | NO_L1TF),
- VULNWL_AMD(0x12, NO_MELTDOWN | NO_SSB | NO_L1TF),
+ VULNWL_INTEL(ATOM_GOLDMONT, NO_MDS | NO_L1TF),
+ VULNWL_INTEL(ATOM_GOLDMONT_X, NO_MDS | NO_L1TF),
+ VULNWL_INTEL(ATOM_GOLDMONT_PLUS, NO_MDS | NO_L1TF),
+
+ /* AMD Family 0xf - 0x12 */
+ VULNWL_AMD(0x0f, NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS),
+ VULNWL_AMD(0x10, NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS),
+ VULNWL_AMD(0x11, NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS),
+ VULNWL_AMD(0x12, NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS),
/* FAMILY_ANY must be last, otherwise 0x0f - 0x12 matches won't work */
- VULNWL_AMD(X86_FAMILY_ANY, NO_MELTDOWN | NO_L1TF),
+ VULNWL_AMD(X86_FAMILY_ANY, NO_MELTDOWN | NO_L1TF | NO_MDS),
{}
};
@@ -1025,6 +1029,9 @@ static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c)
if (ia32_cap & ARCH_CAP_IBRS_ALL)
setup_force_cpu_cap(X86_FEATURE_IBRS_ENHANCED);
+ if (!cpu_matches(NO_MDS) && !(ia32_cap & ARCH_CAP_MDS_NO))
+ setup_force_cpu_bug(X86_BUG_MDS);
+
if (cpu_matches(NO_MELTDOWN))
return;
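
The new ARCH_CAP_MDS_NO bit can also be cross-checked from user space through the msr driver, which reads MSRs by file offset. A sketch, assuming CONFIG_X86_MSR, root privileges, and a CPU that implements IA32_ARCH_CAPABILITIES (MSR 0x10a, bit 5, per the hunks above):

#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
	uint64_t val = 0;
	int fd = open("/dev/cpu/0/msr", O_RDONLY);

	/* pread() at offset 0x10a reads IA32_ARCH_CAPABILITIES; it
	 * fails with EIO on CPUs that do not implement the MSR. */
	if (fd < 0 || pread(fd, &val, sizeof(val), 0x10a) != sizeof(val)) {
		perror("msr");
		return 1;
	}
	printf("ARCH_CAP_MDS_NO %s\n",
	       (val & (1ULL << 5)) ? "set" : "clear");
	close(fd);
	return 0;
}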


@ -1,90 +0,0 @@
From 91439bd017c726a81577dd2bee789580f5bfdf35 Mon Sep 17 00:00:00 2001
From: Thomas Gleixner <tglx@linutronix.de>
Date: Fri, 1 Mar 2019 20:21:08 +0100
Subject: [PATCH 07/30] x86/speculation/mds: Add BUG_MSBDS_ONLY
commit e261f209c3666e842fd645a1e31f001c3a26def9 upstream
This bug bit is set on CPUs which are only affected by Microarchitectural
Store Buffer Data Sampling (MSBDS) and not by any other MDS variant.
This is important because the Store Buffers are partitioned between
Hyper-Threads so cross thread forwarding is not possible. But if a thread
enters or exits a sleep state the store buffer is repartitioned which can
expose data from one thread to the other. This transition can be mitigated.
That means that for CPUs which are only affected by MSBDS SMT can be
enabled, if the CPU is not affected by other SMT sensitive vulnerabilities,
e.g. L1TF. The XEON PHI variants fall into that category, as do the
Silvermont/Airmont ATOMs; for those it is not really relevant as they do
not support SMT, but mark them for completeness' sake.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Reviewed-by: Jon Masters <jcm@redhat.com>
Tested-by: Jon Masters <jcm@redhat.com>
---
arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/kernel/cpu/common.c | 20 ++++++++++++--------
2 files changed, 13 insertions(+), 8 deletions(-)
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 1dc7b8129b55..69037da75ea0 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -380,5 +380,6 @@
#define X86_BUG_SPEC_STORE_BYPASS X86_BUG(17) /* CPU is affected by speculative store bypass attack */
#define X86_BUG_L1TF X86_BUG(18) /* CPU is affected by L1 Terminal Fault */
#define X86_BUG_MDS X86_BUG(19) /* CPU is affected by Microarchitectural data sampling */
+#define X86_BUG_MSBDS_ONLY X86_BUG(20) /* CPU is only affected by the MSDBS variant of BUG_MDS */
#endif /* _ASM_X86_CPUFEATURES_H */
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 0ea1e4bc3e20..1073118b9bf0 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -953,6 +953,7 @@ static void identify_cpu_without_cpuid(struct cpuinfo_x86 *c)
#define NO_SSB BIT(2)
#define NO_L1TF BIT(3)
#define NO_MDS BIT(4)
+#define MSBDS_ONLY BIT(5)
#define VULNWL(_vendor, _family, _model, _whitelist) \
{ X86_VENDOR_##_vendor, _family, _model, X86_FEATURE_ANY, _whitelist }
@@ -976,16 +977,16 @@ static const __initconst struct x86_cpu_id cpu_vuln_whitelist[] = {
VULNWL_INTEL(ATOM_BONNELL, NO_SPECULATION),
VULNWL_INTEL(ATOM_BONNELL_MID, NO_SPECULATION),
- VULNWL_INTEL(ATOM_SILVERMONT, NO_SSB | NO_L1TF),
- VULNWL_INTEL(ATOM_SILVERMONT_X, NO_SSB | NO_L1TF),
- VULNWL_INTEL(ATOM_SILVERMONT_MID, NO_SSB | NO_L1TF),
- VULNWL_INTEL(ATOM_AIRMONT, NO_SSB | NO_L1TF),
- VULNWL_INTEL(XEON_PHI_KNL, NO_SSB | NO_L1TF),
- VULNWL_INTEL(XEON_PHI_KNM, NO_SSB | NO_L1TF),
+ VULNWL_INTEL(ATOM_SILVERMONT, NO_SSB | NO_L1TF | MSBDS_ONLY),
+ VULNWL_INTEL(ATOM_SILVERMONT_X, NO_SSB | NO_L1TF | MSBDS_ONLY),
+ VULNWL_INTEL(ATOM_SILVERMONT_MID, NO_SSB | NO_L1TF | MSBDS_ONLY),
+ VULNWL_INTEL(ATOM_AIRMONT, NO_SSB | NO_L1TF | MSBDS_ONLY),
+ VULNWL_INTEL(XEON_PHI_KNL, NO_SSB | NO_L1TF | MSBDS_ONLY),
+ VULNWL_INTEL(XEON_PHI_KNM, NO_SSB | NO_L1TF | MSBDS_ONLY),
VULNWL_INTEL(CORE_YONAH, NO_SSB),
- VULNWL_INTEL(ATOM_AIRMONT_MID, NO_L1TF),
+ VULNWL_INTEL(ATOM_AIRMONT_MID, NO_L1TF | MSBDS_ONLY),
VULNWL_INTEL(ATOM_GOLDMONT, NO_MDS | NO_L1TF),
VULNWL_INTEL(ATOM_GOLDMONT_X, NO_MDS | NO_L1TF),
@@ -1029,8 +1030,11 @@ static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c)
if (ia32_cap & ARCH_CAP_IBRS_ALL)
setup_force_cpu_cap(X86_FEATURE_IBRS_ENHANCED);
- if (!cpu_matches(NO_MDS) && !(ia32_cap & ARCH_CAP_MDS_NO))
+ if (!cpu_matches(NO_MDS) && !(ia32_cap & ARCH_CAP_MDS_NO)) {
setup_force_cpu_bug(X86_BUG_MDS);
+ if (cpu_matches(MSBDS_ONLY))
+ setup_force_cpu_bug(X86_BUG_MSBDS_ONLY);
+ }
if (cpu_matches(NO_MELTDOWN))
return;


@ -1,43 +0,0 @@
From 10c46ffb2f76b9a73f070877b3e83b9096bf0ed8 Mon Sep 17 00:00:00 2001
From: Andi Kleen <ak@linux.intel.com>
Date: Fri, 18 Jan 2019 16:50:23 -0800
Subject: [PATCH 08/30] x86/kvm: Expose X86_FEATURE_MD_CLEAR to guests
commit 6c4dbbd14730c43f4ed808a9c42ca41625925c22 upstream
X86_FEATURE_MD_CLEAR is a new CPUID bit which is set when microcode
provides the mechanism to invoke a flush of various exploitable CPU buffers
by invoking the VERW instruction.
Hand it through to guests so they can adjust their mitigations.
This also requires corresponding qemu changes, which are available
separately.
[ tglx: Massaged changelog ]
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Reviewed-by: Jon Masters <jcm@redhat.com>
Tested-by: Jon Masters <jcm@redhat.com>
---
arch/x86/kvm/cpuid.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index fe9907517fb4..b810102a9cfa 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -412,7 +412,8 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
/* cpuid 7.0.edx*/
const u32 kvm_cpuid_7_0_edx_x86_features =
F(AVX512_4VNNIW) | F(AVX512_4FMAPS) | F(SPEC_CTRL) |
- F(SPEC_CTRL_SSBD) | F(ARCH_CAPABILITIES) | F(INTEL_STIBP);
+ F(SPEC_CTRL_SSBD) | F(ARCH_CAPABILITIES) | F(INTEL_STIBP) |
+ F(MD_CLEAR);
/* all calls to cpuid_count() should be made on the same cpu */
get_cpu();


@ -1,231 +0,0 @@
From 370fbb129df726c669be7f89403d7b2053f035bc Mon Sep 17 00:00:00 2001
From: Thomas Gleixner <tglx@linutronix.de>
Date: Mon, 18 Feb 2019 23:13:06 +0100
Subject: [PATCH 09/30] x86/speculation/mds: Add mds_clear_cpu_buffers()
commit 6a9e529272517755904b7afa639f6db59ddb793e upstream
The Microarchitectural Data Sampling (MDS) vulnerabilities are mitigated by
clearing the affected CPU buffers. The mechanism for clearing the buffers
uses the unused and obsolete VERW instruction in combination with a
microcode update which triggers a CPU buffer clear when VERW is executed.
Provide an inline function with the assembly magic. The argument of the VERW
instruction must be a memory operand as documented:
"MD_CLEAR enumerates that the memory-operand variant of VERW (for
example, VERW m16) has been extended to also overwrite buffers affected
by MDS. This buffer overwriting functionality is not guaranteed for the
register operand variant of VERW."
Documentation also recommends using a writable data segment selector:
"The buffer overwriting occurs regardless of the result of the VERW
permission check, as well as when the selector is null or causes a
descriptor load segment violation. However, for lowest latency we
recommend using a selector that indicates a valid writable data
segment."
Add x86 specific documentation about MDS and the internal workings of the
mitigation.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Reviewed-by: Jon Masters <jcm@redhat.com>
Tested-by: Jon Masters <jcm@redhat.com>
---
Documentation/index.rst | 1 +
Documentation/x86/conf.py | 10 +++
Documentation/x86/index.rst | 8 +++
Documentation/x86/mds.rst | 99 ++++++++++++++++++++++++++++
arch/x86/include/asm/nospec-branch.h | 25 +++++++
5 files changed, 143 insertions(+)
create mode 100644 Documentation/x86/conf.py
create mode 100644 Documentation/x86/index.rst
create mode 100644 Documentation/x86/mds.rst
diff --git a/Documentation/index.rst b/Documentation/index.rst
index 5db7e87c7cb1..1cdc139adb40 100644
--- a/Documentation/index.rst
+++ b/Documentation/index.rst
@@ -104,6 +104,7 @@ implementation.
:maxdepth: 2
sh/index
+ x86/index
Filesystem Documentation
------------------------
diff --git a/Documentation/x86/conf.py b/Documentation/x86/conf.py
new file mode 100644
index 000000000000..33c5c3142e20
--- /dev/null
+++ b/Documentation/x86/conf.py
@@ -0,0 +1,10 @@
+# -*- coding: utf-8; mode: python -*-
+
+project = "X86 architecture specific documentation"
+
+tags.add("subproject")
+
+latex_documents = [
+ ('index', 'x86.tex', project,
+ 'The kernel development community', 'manual'),
+]
diff --git a/Documentation/x86/index.rst b/Documentation/x86/index.rst
new file mode 100644
index 000000000000..ef389dcf1b1d
--- /dev/null
+++ b/Documentation/x86/index.rst
@@ -0,0 +1,8 @@
+==========================
+x86 architecture specifics
+==========================
+
+.. toctree::
+ :maxdepth: 1
+
+ mds
diff --git a/Documentation/x86/mds.rst b/Documentation/x86/mds.rst
new file mode 100644
index 000000000000..1096738d50f2
--- /dev/null
+++ b/Documentation/x86/mds.rst
@@ -0,0 +1,99 @@
+Microarchitectural Data Sampling (MDS) mitigation
+=================================================
+
+.. _mds:
+
+Overview
+--------
+
+Microarchitectural Data Sampling (MDS) is a family of side channel attacks
+on internal buffers in Intel CPUs. The variants are:
+
+ - Microarchitectural Store Buffer Data Sampling (MSBDS) (CVE-2018-12126)
+ - Microarchitectural Fill Buffer Data Sampling (MFBDS) (CVE-2018-12130)
+ - Microarchitectural Load Port Data Sampling (MLPDS) (CVE-2018-12127)
+
+MSBDS leaks Store Buffer Entries which can be speculatively forwarded to a
+dependent load (store-to-load forwarding) as an optimization. The forward
+can also happen to a faulting or assisting load operation for a different
+memory address, which can be exploited under certain conditions. Store
+buffers are partitioned between Hyper-Threads so cross thread forwarding is
+not possible. But if a thread enters or exits a sleep state the store
+buffer is repartitioned which can expose data from one thread to the other.
+
+MFBDS leaks Fill Buffer Entries. Fill buffers are used internally to manage
+L1 miss situations and to hold data which is returned or sent in response
+to a memory or I/O operation. Fill buffers can forward data to a load
+operation and also write data to the cache. When the fill buffer is
+deallocated it can retain the stale data of the preceding operations which
+can then be forwarded to a faulting or assisting load operation, which can
+be exploited under certain conditions. Fill buffers are shared between
+Hyper-Threads so cross thread leakage is possible.
+
+MLPDS leaks Load Port Data. Load ports are used to perform load operations
+from memory or I/O. The received data is then forwarded to the register
+file or a subsequent operation. In some implementations the Load Port can
+contain stale data from a previous operation which can be forwarded to
+faulting or assisting loads under certain conditions, which again can be
+exploited eventually. Load ports are shared between Hyper-Threads so cross
+thread leakage is possible.
+
+
+Exposure assumptions
+--------------------
+
+It is assumed that attack code resides in user space or in a guest, with one
+exception. The rationale behind this assumption is that the code construct
+needed for exploiting MDS requires:
+
+ - to control the load to trigger a fault or assist
+
+ - to have a disclosure gadget which exposes the speculatively accessed
+ data for consumption through a side channel.
+
+ - to control the pointer through which the disclosure gadget exposes the
+ data
+
+The existence of such a construct in the kernel cannot be excluded with
+100% certainty, but the complexity involved makes it extremely unlikely.
+
+There is one exception, which is untrusted BPF. The functionality of
+untrusted BPF is limited, but it needs to be thoroughly investigated
+whether it can be used to create such a construct.
+
+
+Mitigation strategy
+-------------------
+
+All variants have the same mitigation strategy at least for the single CPU
+thread case (SMT off): Force the CPU to clear the affected buffers.
+
+This is achieved by using the otherwise unused and obsolete VERW
+instruction in combination with a microcode update. The microcode clears
+the affected CPU buffers when the VERW instruction is executed.
+
+For virtualization there are two ways to achieve CPU buffer
+clearing: either via the modified VERW instruction or via the L1D Flush
+command. The latter is issued when L1TF mitigation is enabled so the extra
+VERW can be avoided. If the CPU is not affected by L1TF then VERW needs to
+be issued.
+
+If the VERW instruction with the supplied segment selector argument is
+executed on a CPU without the microcode update there is no side effect
+other than a small number of pointlessly wasted CPU cycles.
+
+This does not protect against cross Hyper-Thread attacks except for MSBDS
+which is only exploitable cross Hyper-Thread when one of the Hyper-Threads
+enters a C-state.
+
+The kernel provides a function to invoke the buffer clearing:
+
+ mds_clear_cpu_buffers()
+
+The mitigation is invoked on kernel/userspace, hypervisor/guest and C-state
+(idle) transitions.
+
+According to current knowledge additional mitigations inside the kernel
+itself are not required because the necessary gadgets to expose the leaked
+data cannot be controlled in a way which allows exploitation from malicious
+user space or VM guests.
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index 032b6009baab..c022732e2cf9 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -317,6 +317,31 @@ DECLARE_STATIC_KEY_FALSE(switch_to_cond_stibp);
DECLARE_STATIC_KEY_FALSE(switch_mm_cond_ibpb);
DECLARE_STATIC_KEY_FALSE(switch_mm_always_ibpb);
+#include <asm/segment.h>
+
+/**
+ * mds_clear_cpu_buffers - Mitigation for MDS vulnerability
+ *
+ * This uses the otherwise unused and obsolete VERW instruction in
+ * combination with microcode which triggers a CPU buffer flush when the
+ * instruction is executed.
+ */
+static inline void mds_clear_cpu_buffers(void)
+{
+ static const u16 ds = __KERNEL_DS;
+
+ /*
+ * Has to be the memory-operand variant because only that
+ * guarantees the CPU buffer flush functionality according to
+ * documentation. The register-operand variant does not.
+ * Works with any segment selector, but a valid writable
+ * data segment is the fastest variant.
+ *
+ * "cc" clobber is required because VERW modifies ZF.
+ */
+ asm volatile("verw %[ds]" : : [ds] "m" (ds) : "cc");
+}
+
#endif /* __ASSEMBLY__ */
/*

View File

@ -1,203 +0,0 @@
From d526a5f8b2e39468b2c2e88ae2ff01b4fb86a945 Mon Sep 17 00:00:00 2001
From: Thomas Gleixner <tglx@linutronix.de>
Date: Mon, 18 Feb 2019 23:42:51 +0100
Subject: [PATCH 10/30] x86/speculation/mds: Clear CPU buffers on exit to user
commit 04dcbdb8057827b043b3c71aa397c4c63e67d086 upstream
Add a static key which controls the invocation of the CPU buffer clear
mechanism on exit to user space and add the call into
prepare_exit_to_usermode() and do_nmi() right before actually returning.
Add documentation on which kernel to user space transitions this covers and
explain why some corner cases are not mitigated.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Reviewed-by: Jon Masters <jcm@redhat.com>
Tested-by: Jon Masters <jcm@redhat.com>
---
Documentation/x86/mds.rst | 52 ++++++++++++++++++++++++++++
arch/x86/entry/common.c | 3 ++
arch/x86/include/asm/nospec-branch.h | 13 +++++++
arch/x86/kernel/cpu/bugs.c | 3 ++
arch/x86/kernel/nmi.c | 4 +++
arch/x86/kernel/traps.c | 8 +++++
6 files changed, 83 insertions(+)
diff --git a/Documentation/x86/mds.rst b/Documentation/x86/mds.rst
index 1096738d50f2..54d935bf283b 100644
--- a/Documentation/x86/mds.rst
+++ b/Documentation/x86/mds.rst
@@ -97,3 +97,55 @@ According to current knowledge additional mitigations inside the kernel
itself are not required because the necessary gadgets to expose the leaked
data cannot be controlled in a way which allows exploitation from malicious
user space or VM guests.
+
+Mitigation points
+-----------------
+
+1. Return to user space
+^^^^^^^^^^^^^^^^^^^^^^^
+
+ When transitioning from kernel to user space the CPU buffers are flushed
+ on affected CPUs when the mitigation is not disabled on the kernel
+ command line. The mitigation is enabled through the static key
+ mds_user_clear.
+
+ The mitigation is invoked in prepare_exit_to_usermode() which covers
+ most of the kernel to user space transitions. There are a few exceptions
+ which do not invoke prepare_exit_to_usermode() on return to user
+ space. These exceptions use the paranoid exit code.
+
+ - Non Maskable Interrupt (NMI):
+
+ Access to sensitive data such as keys or credentials in the NMI context is
+ mostly theoretical: The CPU can do prefetching or execute a
+ misspeculated code path and thereby fetch data which might end up
+ leaking through a buffer.
+
+ But for mounting other attacks the kernel stack address of the task is
+ already valuable information. So in full mitigation mode, the NMI is
+ mitigated on the return from do_nmi() to provide almost complete
+ coverage.
+
+ - Double fault (#DF):
+
+ A double fault is usually fatal, but the ESPFIX workaround, which can
+ be triggered from user space through modify_ldt(2), is a recoverable
+ double fault. #DF uses the paranoid exit path, so explicit mitigation
+ in the double fault handler is required.
+
+ - Machine Check Exception (#MC):
+
+ Another corner case is a #MC which hits between the CPU buffer clear
+ invocation and the actual return to user space. As this is still in kernel
+ space it takes the paranoid exit path which does not clear the CPU
+ buffers. So the #MC handler repopulates the buffers to some
+ extent. Machine checks are not reliably controllable and the window is
+ extremely small, so mitigation would just tick a checkbox that this
+ theoretical corner case is covered. To keep the amount of special
+ cases small, ignore #MC.
+
+ - Debug Exception (#DB):
+
+ This takes the paranoid exit path only when the INT1 breakpoint is in
+ kernel space. #DB on a user space address takes the regular exit path,
+ so no extra mitigation required.
diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
index 3b2490b81918..8353348ddeaf 100644
--- a/arch/x86/entry/common.c
+++ b/arch/x86/entry/common.c
@@ -31,6 +31,7 @@
#include <asm/vdso.h>
#include <linux/uaccess.h>
#include <asm/cpufeature.h>
+#include <asm/nospec-branch.h>
#define CREATE_TRACE_POINTS
#include <trace/events/syscalls.h>
@@ -212,6 +213,8 @@ __visible inline void prepare_exit_to_usermode(struct pt_regs *regs)
#endif
user_enter_irqoff();
+
+ mds_user_clear_cpu_buffers();
}
#define SYSCALL_EXIT_WORK_FLAGS \
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index c022732e2cf9..912d509d34fc 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -317,6 +317,8 @@ DECLARE_STATIC_KEY_FALSE(switch_to_cond_stibp);
DECLARE_STATIC_KEY_FALSE(switch_mm_cond_ibpb);
DECLARE_STATIC_KEY_FALSE(switch_mm_always_ibpb);
+DECLARE_STATIC_KEY_FALSE(mds_user_clear);
+
#include <asm/segment.h>
/**
@@ -342,6 +344,17 @@ static inline void mds_clear_cpu_buffers(void)
asm volatile("verw %[ds]" : : [ds] "m" (ds) : "cc");
}
+/**
+ * mds_user_clear_cpu_buffers - Mitigation for MDS vulnerability
+ *
+ * Clear CPU buffers if the corresponding static key is enabled
+ */
+static inline void mds_user_clear_cpu_buffers(void)
+{
+ if (static_branch_likely(&mds_user_clear))
+ mds_clear_cpu_buffers();
+}
+
#endif /* __ASSEMBLY__ */
/*
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index e5258bd64200..2a69046cc38c 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -61,6 +61,9 @@ DEFINE_STATIC_KEY_FALSE(switch_mm_cond_ibpb);
/* Control unconditional IBPB in switch_mm() */
DEFINE_STATIC_KEY_FALSE(switch_mm_always_ibpb);
+/* Control MDS CPU buffer clear before returning to user space */
+DEFINE_STATIC_KEY_FALSE(mds_user_clear);
+
void __init check_bugs(void)
{
identify_boot_cpu();
diff --git a/arch/x86/kernel/nmi.c b/arch/x86/kernel/nmi.c
index 18bc9b51ac9b..086cf1d1d71d 100644
--- a/arch/x86/kernel/nmi.c
+++ b/arch/x86/kernel/nmi.c
@@ -34,6 +34,7 @@
#include <asm/x86_init.h>
#include <asm/reboot.h>
#include <asm/cache.h>
+#include <asm/nospec-branch.h>
#define CREATE_TRACE_POINTS
#include <trace/events/nmi.h>
@@ -533,6 +534,9 @@ do_nmi(struct pt_regs *regs, long error_code)
write_cr2(this_cpu_read(nmi_cr2));
if (this_cpu_dec_return(nmi_state))
goto nmi_restart;
+
+ if (user_mode(regs))
+ mds_user_clear_cpu_buffers();
}
NOKPROBE_SYMBOL(do_nmi);
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index e6db475164ed..0a5efd764914 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -58,6 +58,7 @@
#include <asm/alternative.h>
#include <asm/fpu/xstate.h>
#include <asm/trace/mpx.h>
+#include <asm/nospec-branch.h>
#include <asm/mpx.h>
#include <asm/vm86.h>
#include <asm/umip.h>
@@ -387,6 +388,13 @@ dotraplinkage void do_double_fault(struct pt_regs *regs, long error_code)
regs->ip = (unsigned long)general_protection;
regs->sp = (unsigned long)&gpregs->orig_ax;
+ /*
+ * This situation can be triggered by userspace via
+ * modify_ldt(2) and the return does not take the regular
+ * user space exit, so a CPU buffer clear is required when
+ * MDS mitigation is enabled.
+ */
+ mds_user_clear_cpu_buffers();
return;
}
#endif
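
The hook relies on the kernel's generic static-key mechanism, so it compiles to a NOP until the key is flipped during boot. A generic sketch of the shape; my_mitigation_key and do_mitigation() are illustrative names, not part of this patch:

  #include <linux/init.h>
  #include <linux/jump_label.h>

  extern void do_mitigation(void);              /* e.g. mds_clear_cpu_buffers() */

  DEFINE_STATIC_KEY_FALSE(my_mitigation_key);   /* off by default, zero cost */

  static inline void my_mitigation_hook(void)
  {
          if (static_branch_likely(&my_mitigation_key))
                  do_mitigation();
  }

  /* flipped once by the mitigation selector during boot */
  static void __init my_select_mitigation(void)
  {
          static_branch_enable(&my_mitigation_key);
  }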

View File

@ -1,56 +0,0 @@
From d04d9bffb07223cb687be8f5fbb059e6fa84b25a Mon Sep 17 00:00:00 2001
From: Thomas Gleixner <tglx@linutronix.de>
Date: Wed, 27 Feb 2019 12:48:14 +0100
Subject: [PATCH 11/30] x86/kvm/vmx: Add MDS protection when L1D Flush is not
active
commit 650b68a0622f933444a6d66936abb3103029413b upstream
CPUs which are affected by L1TF and MDS mitigate MDS with the L1D Flush on
VMENTER when updated microcode is installed.
If a CPU is not affected by L1TF or if the L1D Flush is not in use, then
MDS mitigation needs to be invoked explicitly.
For these cases, follow the host mitigation state and invoke the MDS
mitigation before VMENTER.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Jon Masters <jcm@redhat.com>
Tested-by: Jon Masters <jcm@redhat.com>
---
arch/x86/kernel/cpu/bugs.c | 1 +
arch/x86/kvm/vmx.c | 3 +++
2 files changed, 4 insertions(+)
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 2a69046cc38c..c01468ccefc1 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -63,6 +63,7 @@ DEFINE_STATIC_KEY_FALSE(switch_mm_always_ibpb);
/* Control MDS CPU buffer clear before returning to user space */
DEFINE_STATIC_KEY_FALSE(mds_user_clear);
+EXPORT_SYMBOL_GPL(mds_user_clear);
void __init check_bugs(void)
{
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 215339c7d161..e9bf477209dc 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -10765,8 +10765,11 @@ static void __noclone vmx_vcpu_run(struct kvm_vcpu *vcpu)
evmcs_rsp = static_branch_unlikely(&enable_evmcs) ?
(unsigned long)&current_evmcs->host_rsp : 0;
+ /* L1D Flush includes CPU buffer clear to mitigate MDS */
if (static_branch_unlikely(&vmx_l1d_should_flush))
vmx_l1d_flush(vcpu);
+ else if (static_branch_unlikely(&mds_user_clear))
+ mds_clear_cpu_buffers();
asm(
/* Store host registers */
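
The precedence can be restated as a standalone decision helper. A sketch only; l1d_flush() and verw_clear_buffers() are hypothetical stand-ins for vmx_l1d_flush() and mds_clear_cpu_buffers():

  #include <stdbool.h>

  extern void l1d_flush(void);           /* clears the CPU buffers as a side effect */
  extern void verw_clear_buffers(void);  /* explicit MDS-only clear */

  static void mitigate_before_vmenter(bool l1d_flush_on, bool mds_clear_on)
  {
          if (l1d_flush_on)
                  l1d_flush();
          else if (mds_clear_on)
                  verw_clear_buffers();
  }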

View File

@ -1,223 +0,0 @@
From e7505a450c34e89009ba48c459c08397ee3fc227 Mon Sep 17 00:00:00 2001
From: Thomas Gleixner <tglx@linutronix.de>
Date: Mon, 18 Feb 2019 23:04:01 +0100
Subject: [PATCH 12/30] x86/speculation/mds: Conditionally clear CPU buffers on
idle entry
commit 07f07f55a29cb705e221eda7894dd67ab81ef343 upstream
Add a static key which controls the invocation of the CPU buffer clear
mechanism on idle entry. This is independent of other MDS mitigations
because the idle entry invocation to mitigate the potential leakage due to
store buffer repartitioning is only necessary on SMT systems.
Add the actual invocations to the different halt/mwait variants which
covers all usage sites. mwaitx is not patched as it's not available on
Intel CPUs.
The buffer clear is only invoked before entering the C-State to prevent
stale data from the idling CPU from being spilled to the Hyper-Thread
sibling after the store buffer has been repartitioned and all entries are
available to the non-idle sibling.
When coming out of idle the store buffer is partitioned again so each
sibling has half of it available. The CPU which returned from idle could
then be speculatively exposed to contents of the sibling, but the buffers are
flushed either on exit to user space or on VMENTER.
When conditional buffer clearing is later implemented on top of this, no
action is required either, because before returning to user space the
context switch will set the condition flag which causes a flush on the
return to user path.
Note that the buffer clearing on idle is only sensible on CPUs which are
solely affected by MSBDS and not any other variant of MDS because the other
MDS variants cannot be mitigated when SMT is enabled, so the buffer
clearing on idle would be a window dressing exercise.
This intentionally does not handle the case in the acpi/processor_idle
driver which uses the legacy IO port interface for C-State transitions for
two reasons:
- The acpi/processor_idle driver was replaced by the intel_idle driver
almost a decade ago. Anything Nehalem upwards supports it and defaults
to that new driver.
- The legacy IO port interface is likely to be used on older and therefore
unaffected CPUs or on systems which do not receive microcode updates
anymore, so there is no point in adding that.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Reviewed-by: Jon Masters <jcm@redhat.com>
Tested-by: Jon Masters <jcm@redhat.com>
---
Documentation/x86/mds.rst | 42 ++++++++++++++++++++++++++++
arch/x86/include/asm/irqflags.h | 4 +++
arch/x86/include/asm/mwait.h | 7 +++++
arch/x86/include/asm/nospec-branch.h | 12 ++++++++
arch/x86/kernel/cpu/bugs.c | 3 ++
5 files changed, 68 insertions(+)
diff --git a/Documentation/x86/mds.rst b/Documentation/x86/mds.rst
index 54d935bf283b..87ce8ac9f36e 100644
--- a/Documentation/x86/mds.rst
+++ b/Documentation/x86/mds.rst
@@ -149,3 +149,45 @@ Mitigation points
This takes the paranoid exit path only when the INT1 breakpoint is in
kernel space. #DB on a user space address takes the regular exit path,
so no extra mitigation required.
+
+
+2. C-State transition
+^^^^^^^^^^^^^^^^^^^^^
+
+ When a CPU goes idle and enters a C-State the CPU buffers need to be
+ cleared on affected CPUs when SMT is active. This addresses the
+ repartitioning of the store buffer when one of the Hyper-Threads enters
+ a C-State.
+
+ When SMT is inactive, i.e. either the CPU does not support it or all
+ sibling threads are offline, CPU buffer clearing is not required.
+
+ The idle clearing is enabled on CPUs which are only affected by MSBDS
+ and not by any other MDS variant. The other MDS variants cannot be
+ protected against cross Hyper-Thread attacks because the Fill Buffer and
+ the Load Ports are shared. So on CPUs affected by other variants, the
+ idle clearing would be a window dressing exercise and is therefore not
+ activated.
+
+ The invocation is controlled by the static key mds_idle_clear which is
+ switched depending on the chosen mitigation mode and the SMT state of
+ the system.
+
+ The buffer clear is only invoked before entering the C-State to prevent
+ stale data from the idling CPU from spilling to the Hyper-Thread
+ sibling after the store buffer has been repartitioned and all entries
+ are available to the non-idle sibling.
+
+ When coming out of idle the store buffer is partitioned again so each
+ sibling has half of it available. The CPU coming back from idle could
+ then be speculatively exposed to contents of the sibling. The buffers are
+ flushed either on exit to user space or on VMENTER so malicious code
+ in user space or the guest cannot speculatively access them.
+
+ The mitigation is hooked into all variants of halt()/mwait(), but does
+ not cover the legacy ACPI IO-Port mechanism because the ACPI idle driver
+ has been superseded by the intel_idle driver around 2010 and is
+ preferred on all affected CPUs which are expected to gain the MD_CLEAR
+ functionality in microcode. Aside from that, the IO-Port mechanism is a
+ legacy interface which is only used on older systems which are either
+ not affected or do not receive microcode updates anymore.
diff --git a/arch/x86/include/asm/irqflags.h b/arch/x86/include/asm/irqflags.h
index 15450a675031..c99c66b41e53 100644
--- a/arch/x86/include/asm/irqflags.h
+++ b/arch/x86/include/asm/irqflags.h
@@ -6,6 +6,8 @@
#ifndef __ASSEMBLY__
+#include <asm/nospec-branch.h>
+
/* Provide __cpuidle; we can't safely include <linux/cpu.h> */
#define __cpuidle __attribute__((__section__(".cpuidle.text")))
@@ -54,11 +56,13 @@ static inline void native_irq_enable(void)
static inline __cpuidle void native_safe_halt(void)
{
+ mds_idle_clear_cpu_buffers();
asm volatile("sti; hlt": : :"memory");
}
static inline __cpuidle void native_halt(void)
{
+ mds_idle_clear_cpu_buffers();
asm volatile("hlt": : :"memory");
}
diff --git a/arch/x86/include/asm/mwait.h b/arch/x86/include/asm/mwait.h
index 39a2fb29378a..eb0f80ce8524 100644
--- a/arch/x86/include/asm/mwait.h
+++ b/arch/x86/include/asm/mwait.h
@@ -6,6 +6,7 @@
#include <linux/sched/idle.h>
#include <asm/cpufeature.h>
+#include <asm/nospec-branch.h>
#define MWAIT_SUBSTATE_MASK 0xf
#define MWAIT_CSTATE_MASK 0xf
@@ -40,6 +41,8 @@ static inline void __monitorx(const void *eax, unsigned long ecx,
static inline void __mwait(unsigned long eax, unsigned long ecx)
{
+ mds_idle_clear_cpu_buffers();
+
/* "mwait %eax, %ecx;" */
asm volatile(".byte 0x0f, 0x01, 0xc9;"
:: "a" (eax), "c" (ecx));
@@ -74,6 +77,8 @@ static inline void __mwait(unsigned long eax, unsigned long ecx)
static inline void __mwaitx(unsigned long eax, unsigned long ebx,
unsigned long ecx)
{
+ /* No MDS buffer clear as this is AMD/HYGON only */
+
/* "mwaitx %eax, %ebx, %ecx;" */
asm volatile(".byte 0x0f, 0x01, 0xfb;"
:: "a" (eax), "b" (ebx), "c" (ecx));
@@ -81,6 +86,8 @@ static inline void __mwaitx(unsigned long eax, unsigned long ebx,
static inline void __sti_mwait(unsigned long eax, unsigned long ecx)
{
+ mds_idle_clear_cpu_buffers();
+
trace_hardirqs_on();
/* "mwait %eax, %ecx;" */
asm volatile("sti; .byte 0x0f, 0x01, 0xc9;"
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index 912d509d34fc..599c273f5d00 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -318,6 +318,7 @@ DECLARE_STATIC_KEY_FALSE(switch_mm_cond_ibpb);
DECLARE_STATIC_KEY_FALSE(switch_mm_always_ibpb);
DECLARE_STATIC_KEY_FALSE(mds_user_clear);
+DECLARE_STATIC_KEY_FALSE(mds_idle_clear);
#include <asm/segment.h>
@@ -355,6 +356,17 @@ static inline void mds_user_clear_cpu_buffers(void)
mds_clear_cpu_buffers();
}
+/**
+ * mds_idle_clear_cpu_buffers - Mitigation for MDS vulnerability
+ *
+ * Clear CPU buffers if the corresponding static key is enabled
+ */
+static inline void mds_idle_clear_cpu_buffers(void)
+{
+ if (static_branch_likely(&mds_idle_clear))
+ mds_clear_cpu_buffers();
+}
+
#endif /* __ASSEMBLY__ */
/*
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index c01468ccefc1..428fe6590360 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -64,6 +64,9 @@ DEFINE_STATIC_KEY_FALSE(switch_mm_always_ibpb);
/* Control MDS CPU buffer clear before returning to user space */
DEFINE_STATIC_KEY_FALSE(mds_user_clear);
EXPORT_SYMBOL_GPL(mds_user_clear);
+/* Control MDS CPU buffer clear before idling (halt, mwait) */
+DEFINE_STATIC_KEY_FALSE(mds_idle_clear);
+EXPORT_SYMBOL_GPL(mds_idle_clear);
void __init check_bugs(void)
{

View File

@ -1,191 +0,0 @@
From 892ec2b2472857d93fb2b3d125a13d9df4400ce0 Mon Sep 17 00:00:00 2001
From: Thomas Gleixner <tglx@linutronix.de>
Date: Mon, 18 Feb 2019 22:04:08 +0100
Subject: [PATCH 13/30] x86/speculation/mds: Add mitigation control for MDS
commit bc1241700acd82ec69fde98c5763ce51086269f8 upstream
Now that the mitigations are in place, add a command line parameter to
control the mitigation, a mitigation selector function and an SMT update
mechanism.
This is a minimal, straightforward initial implementation which just
provides an always on/off mode. The command line parameter is:
mds=[full|off]
This is consistent with the existing mitigations for other speculative
hardware vulnerabilities.
The idle invocation is dynamically updated according to the SMT state of
the system similar to the dynamic update of the STIBP mitigation. The idle
mitigation is limited to CPUs which are only affected by MSBDS and not any
other variant, because the other variants cannot be mitigated on SMT
enabled systems.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Jon Masters <jcm@redhat.com>
Tested-by: Jon Masters <jcm@redhat.com>
---
.../admin-guide/kernel-parameters.txt | 22 ++++++
arch/x86/include/asm/processor.h | 5 ++
arch/x86/kernel/cpu/bugs.c | 70 +++++++++++++++++++
3 files changed, 97 insertions(+)
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 8b6567f7cb9b..a0ab4521d7c5 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -2319,6 +2319,28 @@
Format: <first>,<last>
Specifies range of consoles to be captured by the MDA.
+ mds= [X86,INTEL]
+ Control mitigation for the Micro-architectural Data
+ Sampling (MDS) vulnerability.
+
+ Certain CPUs are vulnerable to an exploit against CPU
+ internal buffers which can forward information to a
+ disclosure gadget under certain conditions.
+
+ In vulnerable processors, the speculatively
+ forwarded data can be used in a cache side channel
+ attack to access data to which the attacker does
+ not have direct access.
+
+ This parameter controls the MDS mitigation. The
+ options are:
+
+ full - Enable MDS mitigation on vulnerable CPUs
+ off - Unconditionally disable MDS mitigation
+
+ Not specifying this option is equivalent to
+ mds=full.
+
mem=nn[KMG] [KNL,BOOT] Force usage of a specific amount of memory
Amount of memory to be used when the kernel is not able
to see the whole system memory or for test.
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index d53c54b842da..5e9f953face0 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -997,4 +997,9 @@ enum l1tf_mitigations {
extern enum l1tf_mitigations l1tf_mitigation;
+enum mds_mitigations {
+ MDS_MITIGATION_OFF,
+ MDS_MITIGATION_FULL,
+};
+
#endif /* _ASM_X86_PROCESSOR_H */
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 428fe6590360..413a672f03a3 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -35,6 +35,7 @@
static void __init spectre_v2_select_mitigation(void);
static void __init ssb_select_mitigation(void);
static void __init l1tf_select_mitigation(void);
+static void __init mds_select_mitigation(void);
/* The base value of the SPEC_CTRL MSR that always has to be preserved. */
u64 x86_spec_ctrl_base;
@@ -106,6 +107,8 @@ void __init check_bugs(void)
l1tf_select_mitigation();
+ mds_select_mitigation();
+
#ifdef CONFIG_X86_32
/*
* Check whether we are able to run this kernel safely on SMP.
@@ -211,6 +214,50 @@ static void x86_amd_ssb_disable(void)
wrmsrl(MSR_AMD64_LS_CFG, msrval);
}
+#undef pr_fmt
+#define pr_fmt(fmt) "MDS: " fmt
+
+/* Default mitigation for MDS-affected CPUs */
+static enum mds_mitigations mds_mitigation __ro_after_init = MDS_MITIGATION_FULL;
+
+static const char * const mds_strings[] = {
+ [MDS_MITIGATION_OFF] = "Vulnerable",
+ [MDS_MITIGATION_FULL] = "Mitigation: Clear CPU buffers"
+};
+
+static void __init mds_select_mitigation(void)
+{
+ if (!boot_cpu_has_bug(X86_BUG_MDS)) {
+ mds_mitigation = MDS_MITIGATION_OFF;
+ return;
+ }
+
+ if (mds_mitigation == MDS_MITIGATION_FULL) {
+ if (boot_cpu_has(X86_FEATURE_MD_CLEAR))
+ static_branch_enable(&mds_user_clear);
+ else
+ mds_mitigation = MDS_MITIGATION_OFF;
+ }
+ pr_info("%s\n", mds_strings[mds_mitigation]);
+}
+
+static int __init mds_cmdline(char *str)
+{
+ if (!boot_cpu_has_bug(X86_BUG_MDS))
+ return 0;
+
+ if (!str)
+ return -EINVAL;
+
+ if (!strcmp(str, "off"))
+ mds_mitigation = MDS_MITIGATION_OFF;
+ else if (!strcmp(str, "full"))
+ mds_mitigation = MDS_MITIGATION_FULL;
+
+ return 0;
+}
+early_param("mds", mds_cmdline);
+
#undef pr_fmt
#define pr_fmt(fmt) "Spectre V2 : " fmt
@@ -603,6 +650,26 @@ static void update_indir_branch_cond(void)
static_branch_disable(&switch_to_cond_stibp);
}
+/* Update the static key controlling the MDS CPU buffer clear in idle */
+static void update_mds_branch_idle(void)
+{
+ /*
+ * Enable the idle clearing if SMT is active on CPUs which are
+ * affected only by MSBDS and not any other MDS variant.
+ *
+ * The other variants cannot be mitigated when SMT is enabled, so
+ * clearing the buffers on idle just to prevent the Store Buffer
+ * repartitioning leak would be a window dressing exercise.
+ */
+ if (!boot_cpu_has_bug(X86_BUG_MSBDS_ONLY))
+ return;
+
+ if (sched_smt_active())
+ static_branch_enable(&mds_idle_clear);
+ else
+ static_branch_disable(&mds_idle_clear);
+}
+
void arch_smt_update(void)
{
/* Enhanced IBRS implies STIBP. No update required. */
@@ -623,6 +690,9 @@ void arch_smt_update(void)
break;
}
+ if (mds_mitigation == MDS_MITIGATION_FULL)
+ update_mds_branch_idle();
+
mutex_unlock(&spec_ctrl_mutex);
}
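
Whether the option was supplied can be checked from user space against /proc/cmdline; omitting it is equivalent to mds=full. A small sketch:

  #include <stdio.h>
  #include <string.h>

  int main(void)
  {
          char cmdline[4096];
          FILE *f = fopen("/proc/cmdline", "r");

          if (!f)
                  return 1;
          if (fgets(cmdline, sizeof(cmdline), f) && strstr(cmdline, "mds="))
                  printf("mds= supplied explicitly\n");
          else
                  printf("mds= not supplied, kernel defaults to mds=full\n");
          fclose(f);
          return 0;
  }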

View File

@ -1,127 +0,0 @@
From dc5e8f6e2934df2c4a459c932d123843d0c2375d Mon Sep 17 00:00:00 2001
From: Thomas Gleixner <tglx@linutronix.de>
Date: Mon, 18 Feb 2019 22:51:43 +0100
Subject: [PATCH 14/30] x86/speculation/mds: Add sysfs reporting for MDS
commit 8a4b06d391b0a42a373808979b5028f5c84d9c6a upstream
Add the sysfs reporting file for MDS. It exposes the vulnerability and
mitigation state similar to the existing files for the other speculative
hardware vulnerabilities.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Jon Masters <jcm@redhat.com>
Tested-by: Jon Masters <jcm@redhat.com>
---
.../ABI/testing/sysfs-devices-system-cpu | 1 +
arch/x86/kernel/cpu/bugs.c | 25 +++++++++++++++++++
drivers/base/cpu.c | 8 ++++++
include/linux/cpu.h | 2 ++
4 files changed, 36 insertions(+)
diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu
index 73318225a368..02b7bb711214 100644
--- a/Documentation/ABI/testing/sysfs-devices-system-cpu
+++ b/Documentation/ABI/testing/sysfs-devices-system-cpu
@@ -477,6 +477,7 @@ What: /sys/devices/system/cpu/vulnerabilities
/sys/devices/system/cpu/vulnerabilities/spectre_v2
/sys/devices/system/cpu/vulnerabilities/spec_store_bypass
/sys/devices/system/cpu/vulnerabilities/l1tf
+ /sys/devices/system/cpu/vulnerabilities/mds
Date: January 2018
Contact: Linux kernel mailing list <linux-kernel@vger.kernel.org>
Description: Information about CPU vulnerabilities
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 413a672f03a3..50b7d2a980e8 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -1154,6 +1154,22 @@ static ssize_t l1tf_show_state(char *buf)
}
#endif
+static ssize_t mds_show_state(char *buf)
+{
+ if (!hypervisor_is_type(X86_HYPER_NATIVE)) {
+ return sprintf(buf, "%s; SMT Host state unknown\n",
+ mds_strings[mds_mitigation]);
+ }
+
+ if (boot_cpu_has(X86_BUG_MSBDS_ONLY)) {
+ return sprintf(buf, "%s; SMT %s\n", mds_strings[mds_mitigation],
+ sched_smt_active() ? "mitigated" : "disabled");
+ }
+
+ return sprintf(buf, "%s; SMT %s\n", mds_strings[mds_mitigation],
+ sched_smt_active() ? "vulnerable" : "disabled");
+}
+
static char *stibp_state(void)
{
if (spectre_v2_enabled == SPECTRE_V2_IBRS_ENHANCED)
@@ -1218,6 +1234,10 @@ static ssize_t cpu_show_common(struct device *dev, struct device_attribute *attr
if (boot_cpu_has(X86_FEATURE_L1TF_PTEINV))
return l1tf_show_state(buf);
break;
+
+ case X86_BUG_MDS:
+ return mds_show_state(buf);
+
default:
break;
}
@@ -1249,4 +1269,9 @@ ssize_t cpu_show_l1tf(struct device *dev, struct device_attribute *attr, char *b
{
return cpu_show_common(dev, attr, buf, X86_BUG_L1TF);
}
+
+ssize_t cpu_show_mds(struct device *dev, struct device_attribute *attr, char *buf)
+{
+ return cpu_show_common(dev, attr, buf, X86_BUG_MDS);
+}
#endif
diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
index eb9443d5bae1..2fd6ca1021c2 100644
--- a/drivers/base/cpu.c
+++ b/drivers/base/cpu.c
@@ -546,11 +546,18 @@ ssize_t __weak cpu_show_l1tf(struct device *dev,
return sprintf(buf, "Not affected\n");
}
+ssize_t __weak cpu_show_mds(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ return sprintf(buf, "Not affected\n");
+}
+
static DEVICE_ATTR(meltdown, 0444, cpu_show_meltdown, NULL);
static DEVICE_ATTR(spectre_v1, 0444, cpu_show_spectre_v1, NULL);
static DEVICE_ATTR(spectre_v2, 0444, cpu_show_spectre_v2, NULL);
static DEVICE_ATTR(spec_store_bypass, 0444, cpu_show_spec_store_bypass, NULL);
static DEVICE_ATTR(l1tf, 0444, cpu_show_l1tf, NULL);
+static DEVICE_ATTR(mds, 0444, cpu_show_mds, NULL);
static struct attribute *cpu_root_vulnerabilities_attrs[] = {
&dev_attr_meltdown.attr,
@@ -558,6 +565,7 @@ static struct attribute *cpu_root_vulnerabilities_attrs[] = {
&dev_attr_spectre_v2.attr,
&dev_attr_spec_store_bypass.attr,
&dev_attr_l1tf.attr,
+ &dev_attr_mds.attr,
NULL
};
diff --git a/include/linux/cpu.h b/include/linux/cpu.h
index 5041357d0297..3c87ad888ed3 100644
--- a/include/linux/cpu.h
+++ b/include/linux/cpu.h
@@ -57,6 +57,8 @@ extern ssize_t cpu_show_spec_store_bypass(struct device *dev,
struct device_attribute *attr, char *buf);
extern ssize_t cpu_show_l1tf(struct device *dev,
struct device_attribute *attr, char *buf);
+extern ssize_t cpu_show_mds(struct device *dev,
+ struct device_attribute *attr, char *buf);
extern __printf(4, 5)
struct device *cpu_device_create(struct device *parent, void *drvdata,
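
From user space the new file can be read directly. A short sketch against the path added above:

  #include <stdio.h>

  int main(void)
  {
          char line[256];
          FILE *f = fopen("/sys/devices/system/cpu/vulnerabilities/mds", "r");

          if (!f) {
                  puts("mds entry absent (kernel predates this interface)");
                  return 1;
          }
          if (fgets(line, sizeof(line), f))
                  fputs(line, stdout);  /* e.g. "Mitigation: Clear CPU buffers; SMT vulnerable" */
          fclose(f);
          return 0;
  }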

View File

@ -1,129 +0,0 @@
From 66260821438fb5c3e8b4b662c1ebd6ba0b077c09 Mon Sep 17 00:00:00 2001
From: Thomas Gleixner <tglx@linutronix.de>
Date: Wed, 20 Feb 2019 09:40:40 +0100
Subject: [PATCH 15/30] x86/speculation/mds: Add mitigation mode VMWERV
commit 22dd8365088b6403630b82423cf906491859b65e upstream
In virtualized environments it can happen that the host has the microcode
update which utilizes the VERW instruction to clear CPU buffers, but the
hypervisor is not yet updated to expose the X86_FEATURE_MD_CLEAR CPUID bit
to guests.
Introduce an internal mitigation mode VMWERV which enables the invocation
of the CPU buffer clearing even if X86_FEATURE_MD_CLEAR is not set. If the
system has no updated microcode, this results in a pointless execution of
the VERW instruction wasting a few CPU cycles. If the microcode is updated,
but not exposed to a guest then the CPU buffers will be cleared.
That said: Virtual Machines Will Eventually Receive Vaccine
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Jon Masters <jcm@redhat.com>
Tested-by: Jon Masters <jcm@redhat.com>
---
Documentation/x86/mds.rst | 27 +++++++++++++++++++++++++++
arch/x86/include/asm/processor.h | 1 +
arch/x86/kernel/cpu/bugs.c | 18 ++++++++++++------
3 files changed, 40 insertions(+), 6 deletions(-)
diff --git a/Documentation/x86/mds.rst b/Documentation/x86/mds.rst
index 87ce8ac9f36e..3d6f943f1afb 100644
--- a/Documentation/x86/mds.rst
+++ b/Documentation/x86/mds.rst
@@ -93,11 +93,38 @@ enters a C-state.
The mitigation is invoked on kernel/userspace, hypervisor/guest and C-state
(idle) transitions.
+As a special quirk to address virtualization scenarios where the host has
+the microcode updated, but the hypervisor does not (yet) expose the
+MD_CLEAR CPUID bit to guests, the kernel issues the VERW instruction in the
+hope that it might actually clear the buffers. The state is reflected
+accordingly.
+
According to current knowledge additional mitigations inside the kernel
itself are not required because the necessary gadgets to expose the leaked
data cannot be controlled in a way which allows exploitation from malicious
user space or VM guests.
+Kernel internal mitigation modes
+--------------------------------
+
+ ======= ============================================================
+ off Mitigation is disabled. Either the CPU is not affected or
+ mds=off is supplied on the kernel command line
+
+ full Mitigation is enabled. CPU is affected and MD_CLEAR is
+ advertised in CPUID.
+
+ vmwerv Mitigation is enabled. CPU is affected and MD_CLEAR is not
+ advertised in CPUID. That is mainly for virtualization
+ scenarios where the host has the updated microcode but the
+ hypervisor does not expose MD_CLEAR in CPUID. It's a best
+ effort approach without guarantee.
+ ======= ============================================================
+
+If the CPU is affected and mds=off is not supplied on the kernel command
+line then the kernel selects the appropriate mitigation mode depending on
+the availability of the MD_CLEAR CPUID bit.
+
Mitigation points
-----------------
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 5e9f953face0..b54f25697beb 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -1000,6 +1000,7 @@ extern enum l1tf_mitigations l1tf_mitigation;
enum mds_mitigations {
MDS_MITIGATION_OFF,
MDS_MITIGATION_FULL,
+ MDS_MITIGATION_VMWERV,
};
#endif /* _ASM_X86_PROCESSOR_H */
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 50b7d2a980e8..053d71a3b9cc 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -222,7 +222,8 @@ static enum mds_mitigations mds_mitigation __ro_after_init = MDS_MITIGATION_FULL
static const char * const mds_strings[] = {
[MDS_MITIGATION_OFF] = "Vulnerable",
- [MDS_MITIGATION_FULL] = "Mitigation: Clear CPU buffers"
+ [MDS_MITIGATION_FULL] = "Mitigation: Clear CPU buffers",
+ [MDS_MITIGATION_VMWERV] = "Vulnerable: Clear CPU buffers attempted, no microcode",
};
static void __init mds_select_mitigation(void)
@@ -233,10 +234,9 @@ static void __init mds_select_mitigation(void)
}
if (mds_mitigation == MDS_MITIGATION_FULL) {
- if (boot_cpu_has(X86_FEATURE_MD_CLEAR))
- static_branch_enable(&mds_user_clear);
- else
- mds_mitigation = MDS_MITIGATION_OFF;
+ if (!boot_cpu_has(X86_FEATURE_MD_CLEAR))
+ mds_mitigation = MDS_MITIGATION_VMWERV;
+ static_branch_enable(&mds_user_clear);
}
pr_info("%s\n", mds_strings[mds_mitigation]);
}
@@ -690,8 +690,14 @@ void arch_smt_update(void)
break;
}
- if (mds_mitigation == MDS_MITIGATION_FULL)
+ switch (mds_mitigation) {
+ case MDS_MITIGATION_FULL:
+ case MDS_MITIGATION_VMWERV:
update_mds_branch_idle();
+ break;
+ case MDS_MITIGATION_OFF:
+ break;
+ }
mutex_unlock(&spec_ctrl_mutex);
}
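
The resulting selection collapses to a few lines. A standalone sketch of the state machine after this patch, not the kernel code itself:

  #include <stdbool.h>

  enum mds_state { MDS_OFF, MDS_FULL, MDS_VMWERV };

  static enum mds_state select_mds(bool bug_mds, bool has_md_clear,
                                   enum mds_state requested)
  {
          if (!bug_mds)
                  return MDS_OFF;        /* CPU not affected */
          if (requested == MDS_FULL && !has_md_clear)
                  return MDS_VMWERV;     /* VERW attempted anyway, best effort */
          return requested;
  }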

View File

@ -1,123 +0,0 @@
From 83852a8b0064f1360980a690792c3f438aec06b9 Mon Sep 17 00:00:00 2001
From: Thomas Gleixner <tglx@linutronix.de>
Date: Tue, 19 Feb 2019 11:10:49 +0100
Subject: [PATCH 16/30] Documentation: Move L1TF to separate directory
commit 65fd4cb65b2dad97feb8330b6690445910b56d6a upstream
Move L1TF to a separate directory so the MDS stuff can be added at the
side. Otherwise all the hardware vulnerabilities would have their own top
level entries. Should have done that right away.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Jon Masters <jcm@redhat.com>
---
Documentation/ABI/testing/sysfs-devices-system-cpu | 2 +-
Documentation/admin-guide/hw-vuln/index.rst | 12 ++++++++++++
Documentation/admin-guide/{ => hw-vuln}/l1tf.rst | 0
Documentation/admin-guide/index.rst | 6 ++----
Documentation/admin-guide/kernel-parameters.txt | 2 +-
arch/x86/kernel/cpu/bugs.c | 2 +-
arch/x86/kvm/vmx.c | 4 ++--
7 files changed, 19 insertions(+), 9 deletions(-)
create mode 100644 Documentation/admin-guide/hw-vuln/index.rst
rename Documentation/admin-guide/{ => hw-vuln}/l1tf.rst (100%)
diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu
index 02b7bb711214..f397c2382171 100644
--- a/Documentation/ABI/testing/sysfs-devices-system-cpu
+++ b/Documentation/ABI/testing/sysfs-devices-system-cpu
@@ -491,7 +491,7 @@ Description: Information about CPU vulnerabilities
"Mitigation: $M" CPU is affected and mitigation $M is in effect
Details about the l1tf file can be found in
- Documentation/admin-guide/l1tf.rst
+ Documentation/admin-guide/hw-vuln/l1tf.rst
What: /sys/devices/system/cpu/smt
/sys/devices/system/cpu/smt/active
diff --git a/Documentation/admin-guide/hw-vuln/index.rst b/Documentation/admin-guide/hw-vuln/index.rst
new file mode 100644
index 000000000000..8ce2009f1981
--- /dev/null
+++ b/Documentation/admin-guide/hw-vuln/index.rst
@@ -0,0 +1,12 @@
+========================
+Hardware vulnerabilities
+========================
+
+This section describes CPU vulnerabilities and provides an overview of the
+possible mitigations along with guidance for selecting mitigations if they
+are configurable at compile, boot or run time.
+
+.. toctree::
+ :maxdepth: 1
+
+ l1tf
diff --git a/Documentation/admin-guide/l1tf.rst b/Documentation/admin-guide/hw-vuln/l1tf.rst
similarity index 100%
rename from Documentation/admin-guide/l1tf.rst
rename to Documentation/admin-guide/hw-vuln/l1tf.rst
diff --git a/Documentation/admin-guide/index.rst b/Documentation/admin-guide/index.rst
index 0873685bab0f..89abc5057349 100644
--- a/Documentation/admin-guide/index.rst
+++ b/Documentation/admin-guide/index.rst
@@ -17,14 +17,12 @@ etc.
kernel-parameters
devices
-This section describes CPU vulnerabilities and provides an overview of the
-possible mitigations along with guidance for selecting mitigations if they
-are configurable at compile, boot or run time.
+This section describes CPU vulnerabilities and their mitigations.
.. toctree::
:maxdepth: 1
- l1tf
+ hw-vuln/index
Here is a set of documents aimed at users who are trying to track down
problems and bugs in particular.
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index a0ab4521d7c5..b2c9e47c4167 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -2079,7 +2079,7 @@
Default is 'flush'.
- For details see: Documentation/admin-guide/l1tf.rst
+ For details see: Documentation/admin-guide/hw-vuln/l1tf.rst
l2cr= [PPC]
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 053d71a3b9cc..a7e54a91abc4 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -1089,7 +1089,7 @@ static void __init l1tf_select_mitigation(void)
pr_info("You may make it effective by booting the kernel with mem=%llu parameter.\n",
half_pa);
pr_info("However, doing so will make a part of your RAM unusable.\n");
- pr_info("Reading https://www.kernel.org/doc/html/latest/admin-guide/l1tf.html might help you decide.\n");
+ pr_info("Reading https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/l1tf.html might help you decide.\n");
return;
}
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index e9bf477209dc..73d6d585dd66 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -11130,8 +11130,8 @@ static struct kvm_vcpu *vmx_create_vcpu(struct kvm *kvm, unsigned int id)
return ERR_PTR(err);
}
-#define L1TF_MSG_SMT "L1TF CPU bug present and SMT on, data leak possible. See CVE-2018-3646 and https://www.kernel.org/doc/html/latest/admin-guide/l1tf.html for details.\n"
-#define L1TF_MSG_L1D "L1TF CPU bug present and virtualization mitigation disabled, data leak possible. See CVE-2018-3646 and https://www.kernel.org/doc/html/latest/admin-guide/l1tf.html for details.\n"
+#define L1TF_MSG_SMT "L1TF CPU bug present and SMT on, data leak possible. See CVE-2018-3646 and https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/l1tf.html for details.\n"
+#define L1TF_MSG_L1D "L1TF CPU bug present and virtualization mitigation disabled, data leak possible. See CVE-2018-3646 and https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/l1tf.html for details.\n"
static int vmx_vm_init(struct kvm *kvm)
{

View File

@ -1,381 +0,0 @@
From 4fb7f5dc689d79884f4e6e33f5d2704f44edd42a Mon Sep 17 00:00:00 2001
From: Thomas Gleixner <tglx@linutronix.de>
Date: Tue, 19 Feb 2019 00:02:31 +0100
Subject: [PATCH 17/30] Documentation: Add MDS vulnerability documentation
commit 5999bbe7a6ea3c62029532ec84dc06003a1fa258 upstream
Add the initial MDS vulnerability documentation.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Jon Masters <jcm@redhat.com>
---
.../ABI/testing/sysfs-devices-system-cpu | 3 +-
Documentation/admin-guide/hw-vuln/index.rst | 1 +
Documentation/admin-guide/hw-vuln/l1tf.rst | 1 +
Documentation/admin-guide/hw-vuln/mds.rst | 307 ++++++++++++++++++
.../admin-guide/kernel-parameters.txt | 2 +
5 files changed, 312 insertions(+), 2 deletions(-)
create mode 100644 Documentation/admin-guide/hw-vuln/mds.rst
diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu
index f397c2382171..8718d4ad227b 100644
--- a/Documentation/ABI/testing/sysfs-devices-system-cpu
+++ b/Documentation/ABI/testing/sysfs-devices-system-cpu
@@ -490,8 +490,7 @@ Description: Information about CPU vulnerabilities
"Vulnerable" CPU is affected and no mitigation in effect
"Mitigation: $M" CPU is affected and mitigation $M is in effect
- Details about the l1tf file can be found in
- Documentation/admin-guide/hw-vuln/l1tf.rst
+ See also: Documentation/admin-guide/hw-vuln/index.rst
What: /sys/devices/system/cpu/smt
/sys/devices/system/cpu/smt/active
diff --git a/Documentation/admin-guide/hw-vuln/index.rst b/Documentation/admin-guide/hw-vuln/index.rst
index 8ce2009f1981..ffc064c1ec68 100644
--- a/Documentation/admin-guide/hw-vuln/index.rst
+++ b/Documentation/admin-guide/hw-vuln/index.rst
@@ -10,3 +10,4 @@ are configurable at compile, boot or run time.
:maxdepth: 1
l1tf
+ mds
diff --git a/Documentation/admin-guide/hw-vuln/l1tf.rst b/Documentation/admin-guide/hw-vuln/l1tf.rst
index 9af977384168..31653a9f0e1b 100644
--- a/Documentation/admin-guide/hw-vuln/l1tf.rst
+++ b/Documentation/admin-guide/hw-vuln/l1tf.rst
@@ -445,6 +445,7 @@ The default is 'cond'. If 'l1tf=full,force' is given on the kernel command
line, then 'always' is enforced and the kvm-intel.vmentry_l1d_flush
module parameter is ignored and writes to the sysfs file are rejected.
+.. _mitigation_selection:
Mitigation selection guide
--------------------------
diff --git a/Documentation/admin-guide/hw-vuln/mds.rst b/Documentation/admin-guide/hw-vuln/mds.rst
new file mode 100644
index 000000000000..1de29d28903d
--- /dev/null
+++ b/Documentation/admin-guide/hw-vuln/mds.rst
@@ -0,0 +1,307 @@
+MDS - Microarchitectural Data Sampling
+======================================
+
+Microarchitectural Data Sampling is a hardware vulnerability which allows
+unprivileged speculative access to data which is available in various CPU
+internal buffers.
+
+Affected processors
+-------------------
+
+This vulnerability affects a wide range of Intel processors. The
+vulnerability is not present on:
+
+ - Processors from AMD, Centaur and other non-Intel vendors
+
+ - Older processor models, where the CPU family is < 6
+
+ - Some Atoms (Bonnell, Saltwell, Goldmont, GoldmontPlus)
+
+ - Intel processors which have the ARCH_CAP_MDS_NO bit set in the
+ IA32_ARCH_CAPABILITIES MSR.
+
+Whether a processor is affected or not can be read out from the MDS
+vulnerability file in sysfs. See :ref:`mds_sys_info`.
+
+Not all processors are affected by all variants of MDS, but the mitigation
+is identical for all of them so the kernel treats them as a single
+vulnerability.
+
+Related CVEs
+------------
+
+The following CVE entries are related to the MDS vulnerability:
+
+ ============== ===== ==============================================
+ CVE-2018-12126 MSBDS Microarchitectural Store Buffer Data Sampling
+ CVE-2018-12130 MFBDS Microarchitectural Fill Buffer Data Sampling
+ CVE-2018-12127 MLPDS Microarchitectural Load Port Data Sampling
+ ============== ===== ==============================================
+
+Problem
+-------
+
+When performing store, load or L1 refill operations, processors write data
+into temporary microarchitectural structures (buffers). The data in the
+buffer can be forwarded to load operations as an optimization.
+
+Under certain conditions, usually a fault/assist caused by a load
+operation, data unrelated to the load memory address can be speculatively
+forwarded from the buffers. Because the load operation causes a fault or
+assist and its result will be discarded, the forwarded data will not cause
+incorrect program execution or state changes. But a malicious operation
+may be able to forward this speculative data to a disclosure gadget which
+in turn allows inferring the value via a cache side channel attack.
+
+Because the buffers are potentially shared between Hyper-Threads, cross
+Hyper-Thread attacks are possible.
+
+Deeper technical information is available in the MDS specific x86
+architecture section: :ref:`Documentation/x86/mds.rst <mds>`.
+
+
+Attack scenarios
+----------------
+
+Attacks against the MDS vulnerabilities can be mounted from malicious,
+non-privileged user space applications running on hosts or guests. Malicious
+guest OSes can obviously mount attacks as well.
+
+Contrary to other speculation based vulnerabilities the MDS vulnerability
+does not allow the attacker to control the memory target address. As a
+consequence the attacks are purely sampling based, but as demonstrated with
+the TLBleed attack, samples can be postprocessed successfully.
+
+Web-Browsers
+^^^^^^^^^^^^
+
+ It's unclear whether attacks through web browsers are possible at
+ all. Exploitation through JavaScript is considered very unlikely,
+ but other widely used web technologies like WebAssembly could possibly be
+ abused.
+
+
+.. _mds_sys_info:
+
+MDS system information
+-----------------------
+
+The Linux kernel provides a sysfs interface to enumerate the current MDS
+status of the system: whether the system is vulnerable, and which
+mitigations are active. The relevant sysfs file is:
+
+/sys/devices/system/cpu/vulnerabilities/mds
+
+The possible values in this file are:
+
+ ========================================= =================================
+ 'Not affected' The processor is not vulnerable
+
+ 'Vulnerable' The processor is vulnerable,
+ but no mitigation enabled
+
+ 'Vulnerable: Clear CPU buffers attempted' The processor is vulnerable but
+ microcode is not updated.
+ The mitigation is enabled on a
+ best effort basis.
+ See :ref:`vmwerv`
+
+ 'Mitigation: Clear CPU buffers' The processor is vulnerable and the
+ CPU buffer clearing mitigation is
+ enabled.
+ ========================================= =================================
+
+If the processor is vulnerable then the following information is appended
+to the above information:
+
+ ======================== ============================================
+ 'SMT vulnerable' SMT is enabled
+ 'SMT mitigated' SMT is enabled and mitigated
+ 'SMT disabled' SMT is disabled
+ 'SMT Host state unknown' Kernel runs in a VM, Host SMT state unknown
+ ======================== ============================================
+
+.. _vmwerv:
+
+Best effort mitigation mode
+^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+ If the processor is vulnerable, but the availability of the microcode-based
+ mitigation mechanism is not advertised via CPUID, the kernel selects a best
+ effort mitigation mode. This mode invokes the mitigation instructions
+ without a guarantee that they clear the CPU buffers.
+
+ This is done to address virtualization scenarios where the host has the
+ microcode update applied, but the hypervisor is not yet updated to expose
+ the CPUID to the guest. If the host has updated microcode the protection
+ takes effect; otherwise a few CPU cycles are wasted pointlessly.
+
+ The state in the mds sysfs file reflects this situation accordingly.
+
+
+Mitigation mechanism
+-------------------------
+
+The kernel detects the affected CPUs and the presence of the microcode
+which is required.
+
+If a CPU is affected and the microcode is available, then the kernel
+enables the mitigation by default. The mitigation can be controlled at boot
+time via a kernel command line option. See
+:ref:`mds_mitigation_control_command_line`.
+
+.. _cpu_buffer_clear:
+
+CPU buffer clearing
+^^^^^^^^^^^^^^^^^^^
+
+ The mitigation for MDS clears the affected CPU buffers on return to user
+ space and when entering a guest.
+
+ If SMT is enabled it also clears the buffers on idle entry when the CPU
+ is only affected by MSBDS and not any other MDS variant, because the
+ other variants cannot be protected against cross Hyper-Thread attacks.
+
+ For CPUs which are only affected by MSBDS the user space, guest and idle
+ transition mitigations are sufficient and SMT is not affected.
+
+.. _virt_mechanism:
+
+Virtualization mitigation
+^^^^^^^^^^^^^^^^^^^^^^^^^
+
+ The protection for host to guest transition depends on the L1TF
+ vulnerability of the CPU:
+
+ - CPU is affected by L1TF:
+
+ If the L1D flush mitigation is enabled and up to date microcode is
+ available, the L1D flush mitigation is automatically protecting the
+ guest transition.
+
+ If the L1D flush mitigation is disabled then the MDS mitigation is
+ invoked explicitly when the host MDS mitigation is enabled.
+
+ For details on L1TF and virtualization see:
+ :ref:`Documentation/admin-guide/hw-vuln/l1tf.rst <mitigation_control_kvm>`.
+
+ - CPU is not affected by L1TF:
+
+ CPU buffers are flushed before entering the guest when the host MDS
+ mitigation is enabled.
+
+ The resulting MDS protection matrix for the host to guest transition:
+
+ ============ ===== ============= ============ =================
+ L1TF MDS VMX-L1FLUSH Host MDS MDS-State
+
+ Don't care No Don't care N/A Not affected
+
+ Yes Yes Disabled Off Vulnerable
+
+ Yes Yes Disabled Full Mitigated
+
+ Yes Yes Enabled Don't care Mitigated
+
+ No Yes N/A Off Vulnerable
+
+ No Yes N/A Full Mitigated
+ ============ ===== ============= ============ =================
+
+ This only covers the host to guest transition, i.e. prevents leakage from
+ host to guest, but does not protect the guest internally. Guests need to
+ have their own protections.
+
+.. _xeon_phi:
+
+XEON PHI specific considerations
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+ The XEON PHI processor family is affected by MSBDS which can be exploited
+ cross Hyper-Threads when entering idle states. Some XEON PHI variants allow
+ the use of MWAIT in user space (Ring 3), which opens a potential attack vector
+ for malicious user space. The exposure can be disabled on the kernel
+ command line with the 'ring3mwait=disable' command line option.
+
+ XEON PHI is not affected by the other MDS variants and MSBDS is mitigated
+ before the CPU enters an idle state. As XEON PHI is not affected by L1TF
+ either, disabling SMT is not required for full protection.
+
+.. _mds_smt_control:
+
+SMT control
+^^^^^^^^^^^
+
+ All MDS variants except MSBDS can be attacked cross Hyper-Threads. That
+ means on CPUs which are affected by MFBDS or MLPDS it is necessary to
+ disable SMT for full protection. These are most of the affected CPUs; the
+ exception is XEON PHI, see :ref:`xeon_phi`.
+
+ Disabling SMT can have a significant performance impact, but the impact
+ depends on the type of workloads.
+
+ See the relevant chapter in the L1TF mitigation documentation for details:
+ :ref:`Documentation/admin-guide/hw-vuln/l1tf.rst <smt_control>`.
+
+
+.. _mds_mitigation_control_command_line:
+
+Mitigation control on the kernel command line
+---------------------------------------------
+
+The kernel command line allows control of the MDS mitigations at boot
+time with the option "mds=". The valid arguments for this option are:
+
+ ============ =============================================================
+ full         If the CPU is vulnerable, enable all available mitigations
+              for the MDS vulnerability, CPU buffer clearing on exit to
+              userspace and when entering a VM. Idle transitions are
+              protected as well if SMT is enabled.
+
+              It does not automatically disable SMT.
+
+ off          Disables MDS mitigations completely.
+
+ ============ =============================================================
+
+Not specifying this option is equivalent to "mds=full".
+
+
+Mitigation selection guide
+--------------------------
+
+1. Trusted userspace
+^^^^^^^^^^^^^^^^^^^^
+
+ If all userspace applications are from a trusted source and do not
+ execute untrusted code which is supplied externally, then the mitigation
+ can be disabled.
+
+
+2. Virtualization with trusted guests
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+ The same considerations as above for trusted user space apply.
+
+3. Virtualization with untrusted guests
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+ The protection depends on the state of the L1TF mitigations.
+ See :ref:`virt_mechanism`.
+
+ If the MDS mitigation is enabled and SMT is disabled, guest to host and
+ guest to guest attacks are prevented.
+
+.. _mds_default_mitigations:
+
+Default mitigations
+-------------------
+
+ The kernel default mitigations for vulnerable processors are:
+
+ - Enable CPU buffer clearing
+
+ The kernel does not by default enforce the disabling of SMT, which leaves
+ SMT systems vulnerable when running untrusted code. The same rationale as
+ for L1TF applies.
+See :ref:`Documentation/admin-guide/hw-vuln/l1tf.rst <default_mitigations>`.
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index b2c9e47c4167..290f0946f2ef 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -2341,6 +2341,8 @@
Not specifying this option is equivalent to
mds=full.
+ For details see: Documentation/admin-guide/hw-vuln/mds.rst
+
mem=nn[KMG] [KNL,BOOT] Force usage of a specific amount of memory
Amount of memory to be used when the kernel is not able
to see the whole system memory or for test.
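
The buffer clearing that the mds.rst text above describes comes down to one instruction: with updated microcode, VERW flushes the store buffers, fill buffers and load ports as a side effect of its legacy segment-verification semantics. A minimal standalone sketch, assuming x86 with GCC-style inline assembly (upstream implements this as mds_clear_cpu_buffers() in arch/x86/include/asm/nospec-branch.h, using __KERNEL_DS as the operand):

  /*
   * Illustrative only, not kernel code. Without MD_CLEAR microcode,
   * VERW is just a harmless legacy instruction that sets/clears ZF.
   */
  #include <stdint.h>

  static inline void clear_cpu_buffers(void)
  {
      /* VERW takes a memory operand holding a segment selector;
       * 0x2b is a hypothetical (user data segment) selector value. */
      static const uint16_t ds = 0x2b;

      asm volatile("verw %[ds]" : : [ds] "m" (ds) : "cc");
  }

The kernel emits this on the return-to-user and VM-entry paths, and (for MSBDS-only parts) before idle entry, which is exactly the set of transitions listed in the documentation.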


@@ -1,88 +0,0 @@
From 71cd118bd3491d54b45c8185bb0d8c3a2466424f Mon Sep 17 00:00:00 2001
From: Josh Poimboeuf <jpoimboe@redhat.com>
Date: Tue, 2 Apr 2019 09:59:33 -0500
Subject: [PATCH 18/30] x86/speculation/mds: Add mds=full,nosmt cmdline option
commit d71eb0ce109a124b0fa714832823b9452f2762cf upstream
Add the mds=full,nosmt cmdline option. This is like mds=full, but with
SMT disabled if the CPU is vulnerable.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Tyler Hicks <tyhicks@canonical.com>
Acked-by: Jiri Kosina <jkosina@suse.cz>
---
Documentation/admin-guide/hw-vuln/mds.rst | 3 +++
Documentation/admin-guide/kernel-parameters.txt | 6 ++++--
arch/x86/kernel/cpu/bugs.c | 10 ++++++++++
3 files changed, 17 insertions(+), 2 deletions(-)
diff --git a/Documentation/admin-guide/hw-vuln/mds.rst b/Documentation/admin-guide/hw-vuln/mds.rst
index 1de29d28903d..244ab47d1fb3 100644
--- a/Documentation/admin-guide/hw-vuln/mds.rst
+++ b/Documentation/admin-guide/hw-vuln/mds.rst
@@ -260,6 +260,9 @@ The kernel command line allows to control the MDS mitigations at boot
It does not automatically disable SMT.
+ full,nosmt The same as mds=full, with SMT disabled on vulnerable
+ CPUs. This is the complete mitigation.
+
off Disables MDS mitigations completely.
============ =============================================================
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 290f0946f2ef..df8d10668b11 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -2335,8 +2335,10 @@
This parameter controls the MDS mitigation. The
options are:
- full - Enable MDS mitigation on vulnerable CPUs
- off - Unconditionally disable MDS mitigation
+ full - Enable MDS mitigation on vulnerable CPUs
+ full,nosmt - Enable MDS mitigation and disable
+ SMT on vulnerable CPUs
+ off - Unconditionally disable MDS mitigation
Not specifying this option is equivalent to
mds=full.
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index a7e54a91abc4..3f70da3a4e58 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -219,6 +219,7 @@ static void x86_amd_ssb_disable(void)
/* Default mitigation for L1TF-affected CPUs */
static enum mds_mitigations mds_mitigation __ro_after_init = MDS_MITIGATION_FULL;
+static bool mds_nosmt __ro_after_init = false;
static const char * const mds_strings[] = {
[MDS_MITIGATION_OFF] = "Vulnerable",
@@ -236,8 +237,13 @@ static void __init mds_select_mitigation(void)
if (mds_mitigation == MDS_MITIGATION_FULL) {
if (!boot_cpu_has(X86_FEATURE_MD_CLEAR))
mds_mitigation = MDS_MITIGATION_VMWERV;
+
static_branch_enable(&mds_user_clear);
+
+ if (mds_nosmt && !boot_cpu_has(X86_BUG_MSBDS_ONLY))
+ cpu_smt_disable(false);
}
+
pr_info("%s\n", mds_strings[mds_mitigation]);
}
@@ -253,6 +259,10 @@ static int __init mds_cmdline(char *str)
mds_mitigation = MDS_MITIGATION_OFF;
else if (!strcmp(str, "full"))
mds_mitigation = MDS_MITIGATION_FULL;
+ else if (!strcmp(str, "full,nosmt")) {
+ mds_mitigation = MDS_MITIGATION_FULL;
+ mds_nosmt = true;
+ }
return 0;
}


@@ -1,43 +0,0 @@
From 39a5de311379c9c80f0886522296d08121582cad Mon Sep 17 00:00:00 2001
From: Josh Poimboeuf <jpoimboe@redhat.com>
Date: Tue, 2 Apr 2019 10:00:14 -0500
Subject: [PATCH 19/30] x86/speculation: Move arch_smt_update() call to after
mitigation decisions
commit 7c3658b20194a5b3209a143f63bc9c643c6a3ae2 upstream
arch_smt_update() now has a dependency on both Spectre v2 and MDS
mitigations. Move its initial call to after all the mitigation decisions
have been made.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Tyler Hicks <tyhicks@canonical.com>
Acked-by: Jiri Kosina <jkosina@suse.cz>
---
arch/x86/kernel/cpu/bugs.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 3f70da3a4e58..6ccbcac2cb1d 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -109,6 +109,8 @@ void __init check_bugs(void)
mds_select_mitigation();
+ arch_smt_update();
+
#ifdef CONFIG_X86_32
/*
* Check whether we are able to run this kernel safely on SMP.
@@ -624,9 +626,6 @@ static void __init spectre_v2_select_mitigation(void)
/* Set up IBPB and STIBP depending on the general spectre V2 command */
spectre_v2_user_select_mitigation(cmd);
-
- /* Enable STIBP if appropriate */
- arch_smt_update();
}
static void update_stibp_msr(void * __unused)


@@ -1,58 +0,0 @@
From 7b41a615bd0f13959d9236abd341879c80379069 Mon Sep 17 00:00:00 2001
From: Josh Poimboeuf <jpoimboe@redhat.com>
Date: Tue, 2 Apr 2019 10:00:51 -0500
Subject: [PATCH 20/30] x86/speculation/mds: Add SMT warning message
commit 39226ef02bfb43248b7db12a4fdccb39d95318e3 upstream
MDS is vulnerable with SMT. Make that clear with a one-time printk
whenever SMT first gets enabled.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Tyler Hicks <tyhicks@canonical.com>
Acked-by: Jiri Kosina <jkosina@suse.cz>
---
arch/x86/kernel/cpu/bugs.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 6ccbcac2cb1d..8e74282da80e 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -659,6 +659,9 @@ static void update_indir_branch_cond(void)
static_branch_disable(&switch_to_cond_stibp);
}
+#undef pr_fmt
+#define pr_fmt(fmt) fmt
+
/* Update the static key controlling the MDS CPU buffer clear in idle */
static void update_mds_branch_idle(void)
{
@@ -679,6 +682,8 @@ static void update_mds_branch_idle(void)
static_branch_disable(&mds_idle_clear);
}
+#define MDS_MSG_SMT "MDS CPU bug present and SMT on, data leak possible. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/mds.html for more details.\n"
+
void arch_smt_update(void)
{
/* Enhanced IBRS implies STIBP. No update required. */
@@ -702,6 +707,8 @@ void arch_smt_update(void)
switch (mds_mitigation) {
case MDS_MITIGATION_FULL:
case MDS_MITIGATION_VMWERV:
+ if (sched_smt_active() && !boot_cpu_has(X86_BUG_MSBDS_ONLY))
+ pr_warn_once(MDS_MSG_SMT);
update_mds_branch_idle();
break;
case MDS_MITIGATION_OFF:
@@ -1131,6 +1138,7 @@ static int __init l1tf_cmdline(char *str)
early_param("l1tf", l1tf_cmdline);
#undef pr_fmt
+#define pr_fmt(fmt) fmt
#ifdef CONFIG_SYSFS


@@ -1,31 +0,0 @@
From 7a4591bd67b72f2f6c3e0520f7a670eb3085ccdb Mon Sep 17 00:00:00 2001
From: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Date: Fri, 12 Apr 2019 17:50:57 -0400
Subject: [PATCH 21/30] x86/speculation/mds: Fix comment
commit cae5ec342645746d617dd420d206e1588d47768a upstream
s/L1TF/MDS/
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Tyler Hicks <tyhicks@canonical.com>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
---
arch/x86/kernel/cpu/bugs.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 8e74282da80e..1726f43853ca 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -219,7 +219,7 @@ static void x86_amd_ssb_disable(void)
#undef pr_fmt
#define pr_fmt(fmt) "MDS: " fmt
-/* Default mitigation for L1TF-affected CPUs */
+/* Default mitigation for MDS-affected CPUs */
static enum mds_mitigations mds_mitigation __ro_after_init = MDS_MITIGATION_FULL;
static bool mds_nosmt __ro_after_init = false;


@@ -1,46 +0,0 @@
From 902c28c3ffa6271ab3f6f1104b03683eff7c2ac9 Mon Sep 17 00:00:00 2001
From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Date: Fri, 12 Apr 2019 17:50:58 -0400
Subject: [PATCH 22/30] x86/speculation/mds: Print SMT vulnerable on MSBDS with
mitigations off
commit e2c3c94788b08891dcf3dbe608f9880523ecd71b upstream
This code is only for CPUs which are affected by MSBDS, but are *not*
affected by the other two MDS issues.
For such CPUs, enabling the mds_idle_clear mitigation is enough to
mitigate SMT.
However if user boots with 'mds=off' and still has SMT enabled, we should
not report that SMT is mitigated:
$ cat /sys/devices/system/cpu/vulnerabilities/mds
Vulnerable; SMT mitigated
But rather:
Vulnerable; SMT vulnerable
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Tyler Hicks <tyhicks@canonical.com>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lkml.kernel.org/r/20190412215118.294906495@localhost.localdomain
---
arch/x86/kernel/cpu/bugs.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 1726f43853ca..8d432a3d38a3 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -1186,7 +1186,8 @@ static ssize_t mds_show_state(char *buf)
if (boot_cpu_has(X86_BUG_MSBDS_ONLY)) {
return sprintf(buf, "%s; SMT %s\n", mds_strings[mds_mitigation],
- sched_smt_active() ? "mitigated" : "disabled");
+ (mds_mitigation == MDS_MITIGATION_OFF ? "vulnerable" :
+ sched_smt_active() ? "mitigated" : "disabled"));
}
return sprintf(buf, "%s; SMT %s\n", mds_strings[mds_mitigation],
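
The resulting strings are easier to see in a standalone form. A hypothetical user-space restatement of the reporting logic (names invented; it mirrors the ternary chain added above):

  #include <stdbool.h>
  #include <stdio.h>

  static const char *smt_state(bool msbds_only, bool mds_off, bool smt_on)
  {
      if (msbds_only)
          return mds_off ? "vulnerable"
                         : smt_on ? "mitigated" : "disabled";
      return smt_on ? "vulnerable" : "disabled";
  }

  int main(void)
  {
      /* mds=off on an MSBDS-only part with SMT on: "SMT vulnerable" */
      printf("SMT %s\n", smt_state(true, true, true));
      return 0;
  }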


@@ -1,164 +0,0 @@
From 0e44e1761b78d31665fbce073ce58f42a0ffd4de Mon Sep 17 00:00:00 2001
From: Josh Poimboeuf <jpoimboe@redhat.com>
Date: Fri, 12 Apr 2019 15:39:28 -0500
Subject: [PATCH 23/30] cpu/speculation: Add 'mitigations=' cmdline option
commit 98af8452945c55652de68536afdde3b520fec429 upstream
Keeping track of the number of mitigations for all the CPU speculation
bugs has become overwhelming for many users. It's getting more and more
complicated to decide which mitigations are needed for a given
architecture. Complicating matters is the fact that each arch tends to
have its own custom way to mitigate the same vulnerability.
Most users fall into a few basic categories:
a) they want all mitigations off;
b) they want all reasonable mitigations on, with SMT enabled even if
it's vulnerable; or
c) they want all reasonable mitigations on, with SMT disabled if
vulnerable.
Define a set of curated, arch-independent options, each of which is an
aggregation of existing options:
- mitigations=off: Disable all mitigations.
- mitigations=auto: [default] Enable all the default mitigations, but
leave SMT enabled, even if it's vulnerable.
- mitigations=auto,nosmt: Enable all the default mitigations, disabling
SMT if needed by a mitigation.
Currently, these options are placeholders which don't actually do
anything. They will be fleshed out in upcoming patches.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Jiri Kosina <jkosina@suse.cz> (on x86)
Reviewed-by: Jiri Kosina <jkosina@suse.cz>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H . Peter Anvin" <hpa@zytor.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Waiman Long <longman@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Jon Masters <jcm@redhat.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: linux-s390@vger.kernel.org
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-arch@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Tyler Hicks <tyhicks@canonical.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Steven Price <steven.price@arm.com>
Cc: Phil Auld <pauld@redhat.com>
Link: https://lkml.kernel.org/r/b07a8ef9b7c5055c3a4637c87d07c296d5016fe0.1555085500.git.jpoimboe@redhat.com
---
.../admin-guide/kernel-parameters.txt | 24 +++++++++++++++++++
include/linux/cpu.h | 24 +++++++++++++++++++
kernel/cpu.c | 15 ++++++++++++
3 files changed, 63 insertions(+)
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index df8d10668b11..6a1b94afb005 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -2502,6 +2502,30 @@
in the "bleeding edge" mini2440 support kernel at
http://repo.or.cz/w/linux-2.6/mini2440.git
+ mitigations=
+ Control optional mitigations for CPU vulnerabilities.
+ This is a set of curated, arch-independent options, each
+ of which is an aggregation of existing arch-specific
+ options.
+
+ off
+ Disable all optional CPU mitigations. This
+ improves system performance, but it may also
+ expose users to several CPU vulnerabilities.
+
+ auto (default)
+ Mitigate all CPU vulnerabilities, but leave SMT
+ enabled, even if it's vulnerable. This is for
+ users who don't want to be surprised by SMT
+ getting disabled across kernel upgrades, or who
+ have other ways of avoiding SMT-based attacks.
+ This is the default behavior.
+
+ auto,nosmt
+ Mitigate all CPU vulnerabilities, disabling SMT
+ if needed. This is for users who always want to
+ be fully mitigated, even if it means losing SMT.
+
mminit_loglevel=
[KNL] When CONFIG_DEBUG_MEMORY_INIT is set, this
parameter allows control of the logging verbosity for
diff --git a/include/linux/cpu.h b/include/linux/cpu.h
index 3c87ad888ed3..57ae83c4d5f4 100644
--- a/include/linux/cpu.h
+++ b/include/linux/cpu.h
@@ -189,4 +189,28 @@ static inline void cpu_smt_disable(bool force) { }
static inline void cpu_smt_check_topology(void) { }
#endif
+/*
+ * These are used for a global "mitigations=" cmdline option for toggling
+ * optional CPU mitigations.
+ */
+enum cpu_mitigations {
+ CPU_MITIGATIONS_OFF,
+ CPU_MITIGATIONS_AUTO,
+ CPU_MITIGATIONS_AUTO_NOSMT,
+};
+
+extern enum cpu_mitigations cpu_mitigations;
+
+/* mitigations=off */
+static inline bool cpu_mitigations_off(void)
+{
+ return cpu_mitigations == CPU_MITIGATIONS_OFF;
+}
+
+/* mitigations=auto,nosmt */
+static inline bool cpu_mitigations_auto_nosmt(void)
+{
+ return cpu_mitigations == CPU_MITIGATIONS_AUTO_NOSMT;
+}
+
#endif /* _LINUX_CPU_H_ */
diff --git a/kernel/cpu.c b/kernel/cpu.c
index dc250ec2c096..bc6c880a093f 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -2278,3 +2278,18 @@ void __init boot_cpu_hotplug_init(void)
#endif
this_cpu_write(cpuhp_state.state, CPUHP_ONLINE);
}
+
+enum cpu_mitigations cpu_mitigations __ro_after_init = CPU_MITIGATIONS_AUTO;
+
+static int __init mitigations_parse_cmdline(char *arg)
+{
+ if (!strcmp(arg, "off"))
+ cpu_mitigations = CPU_MITIGATIONS_OFF;
+ else if (!strcmp(arg, "auto"))
+ cpu_mitigations = CPU_MITIGATIONS_AUTO;
+ else if (!strcmp(arg, "auto,nosmt"))
+ cpu_mitigations = CPU_MITIGATIONS_AUTO_NOSMT;
+
+ return 0;
+}
+early_param("mitigations", mitigations_parse_cmdline);


@@ -1,151 +0,0 @@
From d0bf64abd7f837c5faf16e0550d678aed630e520 Mon Sep 17 00:00:00 2001
From: Josh Poimboeuf <jpoimboe@redhat.com>
Date: Fri, 12 Apr 2019 15:39:29 -0500
Subject: [PATCH 24/30] x86/speculation: Support 'mitigations=' cmdline option
commit d68be4c4d31295ff6ae34a8ddfaa4c1a8ff42812 upstream
Configure x86 runtime CPU speculation bug mitigations in accordance with
the 'mitigations=' cmdline option. This affects Meltdown, Spectre v2,
Speculative Store Bypass, and L1TF.
The default behavior is unchanged.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Jiri Kosina <jkosina@suse.cz> (on x86)
Reviewed-by: Jiri Kosina <jkosina@suse.cz>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H . Peter Anvin" <hpa@zytor.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Waiman Long <longman@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Jon Masters <jcm@redhat.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: linux-s390@vger.kernel.org
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-arch@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Tyler Hicks <tyhicks@canonical.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Steven Price <steven.price@arm.com>
Cc: Phil Auld <pauld@redhat.com>
Link: https://lkml.kernel.org/r/6616d0ae169308516cfdf5216bedd169f8a8291b.1555085500.git.jpoimboe@redhat.com
---
Documentation/admin-guide/kernel-parameters.txt | 16 +++++++++++-----
arch/x86/kernel/cpu/bugs.c | 11 +++++++++--
arch/x86/mm/pti.c | 4 +++-
3 files changed, 23 insertions(+), 8 deletions(-)
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 6a1b94afb005..31c17532c219 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -2503,15 +2503,20 @@
http://repo.or.cz/w/linux-2.6/mini2440.git
mitigations=
- Control optional mitigations for CPU vulnerabilities.
- This is a set of curated, arch-independent options, each
- of which is an aggregation of existing arch-specific
- options.
+ [X86] Control optional mitigations for CPU
+ vulnerabilities. This is a set of curated,
+ arch-independent options, each of which is an
+ aggregation of existing arch-specific options.
off
Disable all optional CPU mitigations. This
improves system performance, but it may also
expose users to several CPU vulnerabilities.
+ Equivalent to: nopti [X86]
+ nospectre_v2 [X86]
+ spectre_v2_user=off [X86]
+ spec_store_bypass_disable=off [X86]
+ l1tf=off [X86]
auto (default)
Mitigate all CPU vulnerabilities, but leave SMT
@@ -2519,12 +2524,13 @@
users who don't want to be surprised by SMT
getting disabled across kernel upgrades, or who
have other ways of avoiding SMT-based attacks.
- This is the default behavior.
+ Equivalent to: (default behavior)
auto,nosmt
Mitigate all CPU vulnerabilities, disabling SMT
if needed. This is for users who always want to
be fully mitigated, even if it means losing SMT.
+ Equivalent to: l1tf=flush,nosmt [X86]
mminit_loglevel=
[KNL] When CONFIG_DEBUG_MEMORY_INIT is set, this
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 8d432a3d38a3..904d55cf80a2 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -494,7 +494,8 @@ static enum spectre_v2_mitigation_cmd __init spectre_v2_parse_cmdline(void)
char arg[20];
int ret, i;
- if (cmdline_find_option_bool(boot_command_line, "nospectre_v2"))
+ if (cmdline_find_option_bool(boot_command_line, "nospectre_v2") ||
+ cpu_mitigations_off())
return SPECTRE_V2_CMD_NONE;
ret = cmdline_find_option(boot_command_line, "spectre_v2", arg, sizeof(arg));
@@ -756,7 +757,8 @@ static enum ssb_mitigation_cmd __init ssb_parse_cmdline(void)
char arg[20];
int ret, i;
- if (cmdline_find_option_bool(boot_command_line, "nospec_store_bypass_disable")) {
+ if (cmdline_find_option_bool(boot_command_line, "nospec_store_bypass_disable") ||
+ cpu_mitigations_off()) {
return SPEC_STORE_BYPASS_CMD_NONE;
} else {
ret = cmdline_find_option(boot_command_line, "spec_store_bypass_disable",
@@ -1077,6 +1079,11 @@ static void __init l1tf_select_mitigation(void)
if (!boot_cpu_has_bug(X86_BUG_L1TF))
return;
+ if (cpu_mitigations_off())
+ l1tf_mitigation = L1TF_MITIGATION_OFF;
+ else if (cpu_mitigations_auto_nosmt())
+ l1tf_mitigation = L1TF_MITIGATION_FLUSH_NOSMT;
+
override_cache_bits(&boot_cpu_data);
switch (l1tf_mitigation) {
diff --git a/arch/x86/mm/pti.c b/arch/x86/mm/pti.c
index c1fc1ae6b429..4df3e5c89d57 100644
--- a/arch/x86/mm/pti.c
+++ b/arch/x86/mm/pti.c
@@ -35,6 +35,7 @@
#include <linux/spinlock.h>
#include <linux/mm.h>
#include <linux/uaccess.h>
+#include <linux/cpu.h>
#include <asm/cpufeature.h>
#include <asm/hypervisor.h>
@@ -115,7 +116,8 @@ void __init pti_check_boottime_disable(void)
}
}
- if (cmdline_find_option_bool(boot_command_line, "nopti")) {
+ if (cmdline_find_option_bool(boot_command_line, "nopti") ||
+ cpu_mitigations_off()) {
pti_mode = PTI_FORCE_OFF;
pti_print_if_insecure("disabled on command line.");
return;


@@ -1,122 +0,0 @@
From 7cd777a49624b9e186919a5e1b7fec49dafdb8eb Mon Sep 17 00:00:00 2001
From: Josh Poimboeuf <jpoimboe@redhat.com>
Date: Fri, 12 Apr 2019 15:39:30 -0500
Subject: [PATCH 25/30] powerpc/speculation: Support 'mitigations=' cmdline
option
commit 782e69efb3dfed6e8360bc612e8c7827a901a8f9 upstream
Configure powerpc CPU runtime speculation bug mitigations in accordance
with the 'mitigations=' cmdline option. This affects Meltdown, Spectre
v1, Spectre v2, and Speculative Store Bypass.
The default behavior is unchanged.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Jiri Kosina <jkosina@suse.cz> (on x86)
Reviewed-by: Jiri Kosina <jkosina@suse.cz>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H . Peter Anvin" <hpa@zytor.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Waiman Long <longman@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Jon Masters <jcm@redhat.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: linux-s390@vger.kernel.org
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-arch@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Tyler Hicks <tyhicks@canonical.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Steven Price <steven.price@arm.com>
Cc: Phil Auld <pauld@redhat.com>
Link: https://lkml.kernel.org/r/245a606e1a42a558a310220312d9b6adb9159df6.1555085500.git.jpoimboe@redhat.com
---
Documentation/admin-guide/kernel-parameters.txt | 9 +++++----
arch/powerpc/kernel/security.c | 6 +++---
arch/powerpc/kernel/setup_64.c | 2 +-
3 files changed, 9 insertions(+), 8 deletions(-)
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 31c17532c219..49aa191979c1 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -2503,7 +2503,7 @@
http://repo.or.cz/w/linux-2.6/mini2440.git
mitigations=
- [X86] Control optional mitigations for CPU
+ [X86,PPC] Control optional mitigations for CPU
vulnerabilities. This is a set of curated,
arch-independent options, each of which is an
aggregation of existing arch-specific options.
@@ -2512,10 +2512,11 @@
Disable all optional CPU mitigations. This
improves system performance, but it may also
expose users to several CPU vulnerabilities.
- Equivalent to: nopti [X86]
- nospectre_v2 [X86]
+ Equivalent to: nopti [X86,PPC]
+ nospectre_v1 [PPC]
+ nospectre_v2 [X86,PPC]
spectre_v2_user=off [X86]
- spec_store_bypass_disable=off [X86]
+ spec_store_bypass_disable=off [X86,PPC]
l1tf=off [X86]
auto (default)
diff --git a/arch/powerpc/kernel/security.c b/arch/powerpc/kernel/security.c
index 1341325599a7..4ccbf611a3c5 100644
--- a/arch/powerpc/kernel/security.c
+++ b/arch/powerpc/kernel/security.c
@@ -56,7 +56,7 @@ void setup_barrier_nospec(void)
enable = security_ftr_enabled(SEC_FTR_FAVOUR_SECURITY) &&
security_ftr_enabled(SEC_FTR_BNDS_CHK_SPEC_BAR);
- if (!no_nospec)
+ if (!no_nospec && !cpu_mitigations_off())
enable_barrier_nospec(enable);
}
@@ -115,7 +115,7 @@ static int __init handle_nospectre_v2(char *p)
early_param("nospectre_v2", handle_nospectre_v2);
void setup_spectre_v2(void)
{
- if (no_spectrev2)
+ if (no_spectrev2 || cpu_mitigations_off())
do_btb_flush_fixups();
else
btb_flush_enabled = true;
@@ -299,7 +299,7 @@ void setup_stf_barrier(void)
stf_enabled_flush_types = type;
- if (!no_stf_barrier)
+ if (!no_stf_barrier && !cpu_mitigations_off())
stf_barrier_enable(enable);
}
diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c
index faf00222b324..eaf7300be5ab 100644
--- a/arch/powerpc/kernel/setup_64.c
+++ b/arch/powerpc/kernel/setup_64.c
@@ -955,7 +955,7 @@ void setup_rfi_flush(enum l1d_flush_type types, bool enable)
enabled_flush_types = types;
- if (!no_rfi_flush)
+ if (!no_rfi_flush && !cpu_mitigations_off())
rfi_flush_enable(enable);
}


@@ -1,92 +0,0 @@
From cf0b4c8e4d70fe1d282b4e910834f3982ca10eb4 Mon Sep 17 00:00:00 2001
From: Josh Poimboeuf <jpoimboe@redhat.com>
Date: Fri, 12 Apr 2019 15:39:31 -0500
Subject: [PATCH 26/30] s390/speculation: Support 'mitigations=' cmdline option
commit 0336e04a6520bdaefdb0769d2a70084fa52e81ed upstream
Configure s390 runtime CPU speculation bug mitigations in accordance
with the 'mitigations=' cmdline option. This affects Spectre v1 and
Spectre v2.
The default behavior is unchanged.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Jiri Kosina <jkosina@suse.cz> (on x86)
Reviewed-by: Jiri Kosina <jkosina@suse.cz>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H . Peter Anvin" <hpa@zytor.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Waiman Long <longman@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Jon Masters <jcm@redhat.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: linux-s390@vger.kernel.org
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-arch@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Tyler Hicks <tyhicks@canonical.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Steven Price <steven.price@arm.com>
Cc: Phil Auld <pauld@redhat.com>
Link: https://lkml.kernel.org/r/e4a161805458a5ec88812aac0307ae3908a030fc.1555085500.git.jpoimboe@redhat.com
---
Documentation/admin-guide/kernel-parameters.txt | 5 +++--
arch/s390/kernel/nospec-branch.c | 3 ++-
2 files changed, 5 insertions(+), 3 deletions(-)
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 49aa191979c1..4f3efaaa46bd 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -2503,7 +2503,7 @@
http://repo.or.cz/w/linux-2.6/mini2440.git
mitigations=
- [X86,PPC] Control optional mitigations for CPU
+ [X86,PPC,S390] Control optional mitigations for CPU
vulnerabilities. This is a set of curated,
arch-independent options, each of which is an
aggregation of existing arch-specific options.
@@ -2514,7 +2514,8 @@
expose users to several CPU vulnerabilities.
Equivalent to: nopti [X86,PPC]
nospectre_v1 [PPC]
- nospectre_v2 [X86,PPC]
+ nobp=0 [S390]
+ nospectre_v2 [X86,PPC,S390]
spectre_v2_user=off [X86]
spec_store_bypass_disable=off [X86,PPC]
l1tf=off [X86]
diff --git a/arch/s390/kernel/nospec-branch.c b/arch/s390/kernel/nospec-branch.c
index bdddaae96559..649135cbedd5 100644
--- a/arch/s390/kernel/nospec-branch.c
+++ b/arch/s390/kernel/nospec-branch.c
@@ -1,6 +1,7 @@
// SPDX-License-Identifier: GPL-2.0
#include <linux/module.h>
#include <linux/device.h>
+#include <linux/cpu.h>
#include <asm/nospec-branch.h>
static int __init nobp_setup_early(char *str)
@@ -58,7 +59,7 @@ early_param("nospectre_v2", nospectre_v2_setup_early);
void __init nospec_auto_detect(void)
{
- if (test_facility(156)) {
+ if (test_facility(156) || cpu_mitigations_off()) {
/*
* The machine supports etokens.
* Disable expolines and disable nobp.


@@ -1,59 +0,0 @@
From d15a5678215ac4702f48d739ca3653ad1e568b52 Mon Sep 17 00:00:00 2001
From: Josh Poimboeuf <jpoimboe@redhat.com>
Date: Wed, 17 Apr 2019 16:39:02 -0500
Subject: [PATCH 27/30] x86/speculation/mds: Add 'mitigations=' support for MDS
commit 5c14068f87d04adc73ba3f41c2a303d3c3d1fa12 upstream
Add MDS to the new 'mitigations=' cmdline option.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
Documentation/admin-guide/kernel-parameters.txt | 2 ++
arch/x86/kernel/cpu/bugs.c | 5 +++--
2 files changed, 5 insertions(+), 2 deletions(-)
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 4f3efaaa46bd..a29301d6e6c6 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -2519,6 +2519,7 @@
spectre_v2_user=off [X86]
spec_store_bypass_disable=off [X86,PPC]
l1tf=off [X86]
+ mds=off [X86]
auto (default)
Mitigate all CPU vulnerabilities, but leave SMT
@@ -2533,6 +2534,7 @@
if needed. This is for users who always want to
be fully mitigated, even if it means losing SMT.
Equivalent to: l1tf=flush,nosmt [X86]
+ mds=full,nosmt [X86]
mminit_loglevel=
[KNL] When CONFIG_DEBUG_MEMORY_INIT is set, this
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 904d55cf80a2..9b096f26d1c8 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -231,7 +231,7 @@ static const char * const mds_strings[] = {
static void __init mds_select_mitigation(void)
{
- if (!boot_cpu_has_bug(X86_BUG_MDS)) {
+ if (!boot_cpu_has_bug(X86_BUG_MDS) || cpu_mitigations_off()) {
mds_mitigation = MDS_MITIGATION_OFF;
return;
}
@@ -242,7 +242,8 @@ static void __init mds_select_mitigation(void)
static_branch_enable(&mds_user_clear);
- if (mds_nosmt && !boot_cpu_has(X86_BUG_MSBDS_ONLY))
+ if (!boot_cpu_has(X86_BUG_MSBDS_ONLY) &&
+ (mds_nosmt || cpu_mitigations_auto_nosmt()))
cpu_smt_disable(false);
}
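
Combined with the earlier mds= option, the selection now reduces to a small decision function. A hypothetical user-space model of that logic (names invented; the MSBDS-only special case for SMT is kept, everything else is simplified):

  #include <stdbool.h>
  #include <stdio.h>
  #include <string.h>

  struct mds_decision { bool clear_buffers; bool disable_smt; };

  static struct mds_decision decide(const char *mitigations, const char *mds,
                                    bool msbds_only)
  {
      struct mds_decision d = { false, false };
      bool global_off = !strcmp(mitigations, "off");
      bool nosmt = !strcmp(mds, "full,nosmt") ||
                   !strcmp(mitigations, "auto,nosmt");

      /* mitigations=off wins, mirroring the cpu_mitigations_off() check */
      if (global_off || !strcmp(mds, "off"))
          return d;
      d.clear_buffers = true;
      d.disable_smt = nosmt && !msbds_only;
      return d;
  }

  int main(void)
  {
      struct mds_decision d = decide("auto,nosmt", "full", false);
      printf("clear=%d nosmt=%d\n", d.clear_buffers, d.disable_smt); /* 1 1 */
      return 0;
  }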


@@ -1,69 +0,0 @@
From ce4dbfe6007776bac14b2435bcf7c17976daeafe Mon Sep 17 00:00:00 2001
From: speck for Pawan Gupta <speck@linutronix.de>
Date: Mon, 6 May 2019 12:23:50 -0700
Subject: [PATCH 28/30] x86/mds: Add MDSUM variant to the MDS documentation
commit e672f8bf71c66253197e503f75c771dd28ada4a0 upstream
Updated the documentation for a new CVE-2019-11091 Microarchitectural Data
Sampling Uncacheable Memory (MDSUM) which is a variant of
Microarchitectural Data Sampling (MDS). MDS is a family of side channel
attacks on internal buffers in Intel CPUs.
MDSUM is a special case of MSBDS, MFBDS and MLPDS. An uncacheable load from
memory that takes a fault or assist can leave data in a microarchitectural
structure that may later be observed using one of the same methods used by
MSBDS, MFBDS or MLPDS. There are no new code changes expected for MDSUM.
The existing mitigation for MDS applies to MDSUM as well.
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Tyler Hicks <tyhicks@canonical.com>
Reviewed-by: Jon Masters <jcm@redhat.com>
---
Documentation/admin-guide/hw-vuln/mds.rst | 5 +++--
Documentation/x86/mds.rst | 5 +++++
2 files changed, 8 insertions(+), 2 deletions(-)
diff --git a/Documentation/admin-guide/hw-vuln/mds.rst b/Documentation/admin-guide/hw-vuln/mds.rst
index 244ab47d1fb3..e0dccf414eca 100644
--- a/Documentation/admin-guide/hw-vuln/mds.rst
+++ b/Documentation/admin-guide/hw-vuln/mds.rst
@@ -32,11 +32,12 @@ Related CVEs
The following CVE entries are related to the MDS vulnerability:
- ============== ===== ==============================================
+ ============== ===== ===================================================
CVE-2018-12126 MSBDS Microarchitectural Store Buffer Data Sampling
CVE-2018-12130 MFBDS Microarchitectural Fill Buffer Data Sampling
CVE-2018-12127 MLPDS Microarchitectural Load Port Data Sampling
- ============== ===== ==============================================
+ CVE-2019-11091 MDSUM Microarchitectural Data Sampling Uncacheable Memory
+ ============== ===== ===================================================
Problem
-------
diff --git a/Documentation/x86/mds.rst b/Documentation/x86/mds.rst
index 3d6f943f1afb..979945be257a 100644
--- a/Documentation/x86/mds.rst
+++ b/Documentation/x86/mds.rst
@@ -12,6 +12,7 @@ Microarchitectural Data Sampling (MDS) is a family of side channel attacks
- Microarchitectural Store Buffer Data Sampling (MSBDS) (CVE-2018-12126)
- Microarchitectural Fill Buffer Data Sampling (MFBDS) (CVE-2018-12130)
- Microarchitectural Load Port Data Sampling (MLPDS) (CVE-2018-12127)
+ - Microarchitectural Data Sampling Uncacheable Memory (MDSUM) (CVE-2019-11091)
MSBDS leaks Store Buffer Entries which can be speculatively forwarded to a
dependent load (store-to-load forwarding) as an optimization. The forward
@@ -38,6 +39,10 @@ faulting or assisting loads under certain conditions, which again can be
exploited eventually. Load ports are shared between Hyper-Threads so cross
thread leakage is possible.
+MDSUM is a special case of MSBDS, MFBDS and MLPDS. An uncacheable load from
+memory that takes a fault or assist can leave data in a microarchitectural
+structure that may later be observed using one of the same methods used by
+MSBDS, MFBDS or MLPDS.
Exposure assumptions
--------------------


@@ -1,62 +0,0 @@
From 9cd0e06f7739f9e1a15379e971b3e6c64816d36f Mon Sep 17 00:00:00 2001
From: Tyler Hicks <tyhicks@canonical.com>
Date: Mon, 6 May 2019 23:52:58 +0000
Subject: [PATCH 29/30] Documentation: Correct the possible MDS sysfs values
commit ea01668f9f43021b28b3f4d5ffad50106a1e1301 upstream
Adjust the last two rows in the table that display possible values when
MDS mitigation is enabled. They both were slightly inaccurate.
In addition, convert the table of possible values and their descriptions
to a list-table. The simple table format uses the top border of equals
signs to determine cell width which resulted in the first column being
far too wide in comparison to the second column that contained the
majority of the text.
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
Documentation/admin-guide/hw-vuln/mds.rst | 29 ++++++++++-------------
1 file changed, 13 insertions(+), 16 deletions(-)
diff --git a/Documentation/admin-guide/hw-vuln/mds.rst b/Documentation/admin-guide/hw-vuln/mds.rst
index e0dccf414eca..e3a796c0d3a2 100644
--- a/Documentation/admin-guide/hw-vuln/mds.rst
+++ b/Documentation/admin-guide/hw-vuln/mds.rst
@@ -95,22 +95,19 @@ status of the system: whether the system is vulnerable, and which
The possible values in this file are:
- ========================================= =================================
- 'Not affected' The processor is not vulnerable
-
- 'Vulnerable' The processor is vulnerable,
- but no mitigation enabled
-
- 'Vulnerable: Clear CPU buffers attempted' The processor is vulnerable but
- microcode is not updated.
- The mitigation is enabled on a
- best effort basis.
- See :ref:`vmwerv`
-
- 'Mitigation: CPU buffer clear' The processor is vulnerable and the
- CPU buffer clearing mitigation is
- enabled.
- ========================================= =================================
+ .. list-table::
+
+ * - 'Not affected'
+ - The processor is not vulnerable
+ * - 'Vulnerable'
+ - The processor is vulnerable, but no mitigation enabled
+ * - 'Vulnerable: Clear CPU buffers attempted, no microcode'
+ - The processor is vulnerable but microcode is not updated.
+
+ The mitigation is enabled on a best effort basis. See :ref:`vmwerv`
+ * - 'Mitigation: Clear CPU buffers'
+ - The processor is vulnerable and the CPU buffer clearing mitigation is
+ enabled.
If the processor is vulnerable then the following information is appended
to the above information:


@@ -1,29 +0,0 @@
From 011c5b50bb2e545174b2bcd1452b4f1a099e79b0 Mon Sep 17 00:00:00 2001
From: Josh Poimboeuf <jpoimboe@redhat.com>
Date: Tue, 7 May 2019 15:05:22 -0500
Subject: [PATCH 30/30] x86/speculation/mds: Fix documentation typo
commit 95310e348a321b45fb746c176961d4da72344282 upstream
Fix a minor typo in the MDS documentation: "eanbled" -> "enabled".
Reported-by: Jeff Bastian <jbastian@redhat.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
Documentation/x86/mds.rst | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/Documentation/x86/mds.rst b/Documentation/x86/mds.rst
index 979945be257a..534e9baa4e1d 100644
--- a/Documentation/x86/mds.rst
+++ b/Documentation/x86/mds.rst
@@ -116,7 +116,7 @@ Kernel internal mitigation modes
off Mitigation is disabled. Either the CPU is not affected or
mds=off is supplied on the kernel command line
- full Mitigation is eanbled. CPU is affected and MD_CLEAR is
+ full Mitigation is enabled. CPU is affected and MD_CLEAR is
advertised in CPUID.
vmwerv Mitigation is enabled. CPU is affected and MD_CLEAR is not


@@ -1,40 +0,0 @@
From: Breno Leitao <leitao@debian.org>
Date: Mon, 22 Oct 2018 11:54:12 -0300
Subject: powerpc/64s: Include cpu header
Origin: https://git.kernel.org/linus/42e2acde1237878462b028f5a27d9cc5bea7502c
Current powerpc security.c file is defining functions, as
cpu_show_meltdown(), cpu_show_spectre_v{1,2} and others, that are being
declared at linux/cpu.h header without including the header file that
contains these declarations.
This is being reported by sparse, which thinks that these functions are
static, due to the lack of declaration:
arch/powerpc/kernel/security.c:105:9: warning: symbol 'cpu_show_meltdown' was not declared. Should it be static?
arch/powerpc/kernel/security.c:139:9: warning: symbol 'cpu_show_spectre_v1' was not declared. Should it be static?
arch/powerpc/kernel/security.c:161:9: warning: symbol 'cpu_show_spectre_v2' was not declared. Should it be static?
arch/powerpc/kernel/security.c:209:6: warning: symbol 'stf_barrier' was not declared. Should it be static?
arch/powerpc/kernel/security.c:289:9: warning: symbol 'cpu_show_spec_store_bypass' was not declared. Should it be static?
This patch simply includes the proper header (linux/cpu.h) to match
function definition and declaration.
Signed-off-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
---
arch/powerpc/kernel/security.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/powerpc/kernel/security.c b/arch/powerpc/kernel/security.c
index f6f469fc4073..9703dce36307 100644
--- a/arch/powerpc/kernel/security.c
+++ b/arch/powerpc/kernel/security.c
@@ -4,6 +4,7 @@
//
// Copyright 2018, Michael Ellerman, IBM Corporation.
+#include <linux/cpu.h>
#include <linux/kernel.h>
#include <linux/device.h>
#include <linux/seq_buf.h>


@@ -1,123 +0,0 @@
From: Eric Dumazet <edumazet@google.com>
Date: Mon, 17 Jun 2019 10:03:53 -0700
Subject: [PATCH net 3/4] tcp: add tcp_min_snd_mss sysctl
Origin: https://patchwork.ozlabs.org/patch/1117157/
Some TCP peers announce a very small MSS option in their SYN and/or
SYN/ACK messages.
This forces the stack to send packets with a very high network/cpu
overhead.
Linux has enforced a minimal value of 48. Since this value includes
the size of TCP options, and the options can consume up to 40
bytes, this means that each segment can include only 8 bytes of payload.
In some cases, it can be useful to increase the minimal value
to a saner value.
We still let the default to 48 (TCP_MIN_SND_MSS), for compatibility
reasons.
Note that TCP_MAXSEG socket option enforces a minimal value
of (TCP_MIN_MSS). David Miller increased this minimal value
in commit c39508d6f118 ("tcp: Make TCP_MAXSEG minimum more correct.")
from 64 to 88.
We might in the future merge TCP_MIN_SND_MSS and TCP_MIN_MSS.
CVE-2019-11479 -- tcp mss hardcoded to 48
Signed-off-by: Eric Dumazet <edumazet@google.com>
Suggested-by: Jonathan Looney <jtl@netflix.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Tyler Hicks <tyhicks@canonical.com>
Cc: Bruce Curtis <brucec@netflix.com>
Cc: Jonathan Lemon <jonathan.lemon@gmail.com>
Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Acked-by: Tyler Hicks <tyhicks@canonical.com>
---
Documentation/networking/ip-sysctl.txt | 8 ++++++++
include/net/netns/ipv4.h | 1 +
net/ipv4/sysctl_net_ipv4.c | 11 +++++++++++
net/ipv4/tcp_ipv4.c | 1 +
net/ipv4/tcp_output.c | 3 +--
5 files changed, 22 insertions(+), 2 deletions(-)
--- a/Documentation/networking/ip-sysctl.txt
+++ b/Documentation/networking/ip-sysctl.txt
@@ -250,6 +250,14 @@ tcp_base_mss - INTEGER
Path MTU discovery (MTU probing). If MTU probing is enabled,
this is the initial MSS used by the connection.
+tcp_min_snd_mss - INTEGER
+ TCP SYN and SYNACK messages usually advertise an ADVMSS option,
+ as described in RFC 1122 and RFC 6691.
+ If this ADVMSS option is smaller than tcp_min_snd_mss,
+ it is silently capped to tcp_min_snd_mss.
+
+ Default : 48 (at least 8 bytes of payload per segment)
+
tcp_congestion_control - STRING
Set the congestion control algorithm to be used for new
connections. The algorithm "reno" is always available, but
--- a/include/net/netns/ipv4.h
+++ b/include/net/netns/ipv4.h
@@ -113,6 +113,7 @@ struct netns_ipv4 {
#endif
int sysctl_tcp_mtu_probing;
int sysctl_tcp_base_mss;
+ int sysctl_tcp_min_snd_mss;
int sysctl_tcp_probe_threshold;
u32 sysctl_tcp_probe_interval;
--- a/net/ipv4/sysctl_net_ipv4.c
+++ b/net/ipv4/sysctl_net_ipv4.c
@@ -39,6 +39,8 @@ static int ip_local_port_range_min[] = {
static int ip_local_port_range_max[] = { 65535, 65535 };
static int tcp_adv_win_scale_min = -31;
static int tcp_adv_win_scale_max = 31;
+static int tcp_min_snd_mss_min = TCP_MIN_SND_MSS;
+static int tcp_min_snd_mss_max = 65535;
static int ip_privileged_port_min;
static int ip_privileged_port_max = 65535;
static int ip_ttl_min = 1;
@@ -737,6 +739,15 @@ static struct ctl_table ipv4_net_table[]
.proc_handler = proc_dointvec,
},
{
+ .procname = "tcp_min_snd_mss",
+ .data = &init_net.ipv4.sysctl_tcp_min_snd_mss,
+ .maxlen = sizeof(int),
+ .mode = 0644,
+ .proc_handler = proc_dointvec_minmax,
+ .extra1 = &tcp_min_snd_mss_min,
+ .extra2 = &tcp_min_snd_mss_max,
+ },
+ {
.procname = "tcp_probe_threshold",
.data = &init_net.ipv4.sysctl_tcp_probe_threshold,
.maxlen = sizeof(int),
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -2527,6 +2527,7 @@ static int __net_init tcp_sk_init(struct
net->ipv4.sysctl_tcp_ecn_fallback = 1;
net->ipv4.sysctl_tcp_base_mss = TCP_BASE_MSS;
+ net->ipv4.sysctl_tcp_min_snd_mss = TCP_MIN_SND_MSS;
net->ipv4.sysctl_tcp_probe_threshold = TCP_PROBE_THRESHOLD;
net->ipv4.sysctl_tcp_probe_interval = TCP_PROBE_INTERVAL;
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -1462,8 +1462,7 @@ static inline int __tcp_mtu_to_mss(struc
mss_now -= icsk->icsk_ext_hdr_len;
/* Then reserve room for full set of TCP options and 8 bytes of data */
- if (mss_now < TCP_MIN_SND_MSS)
- mss_now = TCP_MIN_SND_MSS;
+ mss_now = max(mss_now, sock_net(sk)->ipv4.sysctl_tcp_min_snd_mss);
return mss_now;
}
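
The overhead being avoided here is easy to quantify. A quick illustrative calculation, using 536 only as the classic IPv4 default MSS for contrast:

  #include <stdio.h>

  int main(void)
  {
      const long payload = 1L << 20;   /* 1 MiB of application data */
      const long opts = 40;            /* MAX_TCP_OPTION_SPACE */

      /* mss=48 leaves 8 payload bytes: 131072 segments for 1 MiB */
      printf("mss=48:  %ld segments\n", payload / (48 - opts));
      /* mss=536 leaves 496 payload bytes: 2114 segments */
      printf("mss=536: %ld segments\n", payload / (536 - opts));
      return 0;
  }

At the 48-byte floor a sender needs over 60x more segments, and therefore over 60x more per-packet processing, for the same payload.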


@@ -1,36 +0,0 @@
From: Eric Dumazet <edumazet@google.com>
Date: Mon, 17 Jun 2019 10:03:54 -0700
Subject: [PATCH net 4/4] tcp: enforce tcp_min_snd_mss in tcp_mtu_probing()
Origin: https://patchwork.ozlabs.org/patch/1117158/
If MTU probing is enabled, tcp_mtu_probing() could very well end up
with a too small MSS.
Use the new sysctl tcp_min_snd_mss to make sure MSS search
is performed in an acceptable range.
CVE-2019-11479 -- tcp mss hardcoded to 48
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Cc: Jonathan Looney <jtl@netflix.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Tyler Hicks <tyhicks@canonical.com>
Cc: Bruce Curtis <brucec@netflix.com>
Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Acked-by: Tyler Hicks <tyhicks@canonical.com>
---
net/ipv4/tcp_timer.c | 1 +
1 file changed, 1 insertion(+)
--- a/net/ipv4/tcp_timer.c
+++ b/net/ipv4/tcp_timer.c
@@ -166,6 +166,7 @@ static void tcp_mtu_probing(struct inet_
mss = tcp_mtu_to_mss(sk, icsk->icsk_mtup.search_low) >> 1;
mss = min(net->ipv4.sysctl_tcp_base_mss, mss);
mss = max(mss, 68 - tcp_sk(sk)->tcp_header_len);
+ mss = max(mss, net->ipv4.sysctl_tcp_min_snd_mss);
icsk->icsk_mtup.search_low = tcp_mss_to_mtu(sk, mss);
}
tcp_sync_mss(sk, icsk->icsk_pmtu_cookie);


@@ -1,154 +0,0 @@
From: Eric Dumazet <edumazet@google.com>
Date: Mon, 17 Jun 2019 10:03:51 -0700
Subject: [PATCH net 1/4] tcp: limit payload size of sacked skbs
Origin: https://patchwork.ozlabs.org/patch/1117155/
Jonathan Looney reported that TCP can trigger the following crash
in tcp_shifted_skb() :
BUG_ON(tcp_skb_pcount(skb) < pcount);
This can happen if the remote peer has advertized the smallest
MSS that Linux TCP accepts: 48
An skb can hold 17 fragments, and each fragment can hold 32KB
on x86, or 64KB on PowerPC.
This means that the 16bit width of TCP_SKB_CB(skb)->tcp_gso_segs
can overflow.
Note that tcp_sendmsg() builds skbs with less than 64KB
of payload, so this problem needs SACK to be enabled.
SACK blocks allow TCP to coalesce multiple skbs in the retransmit
queue, thus filling the 17 fragments to maximal capacity.
CVE-2019-11477 -- u16 overflow of TCP_SKB_CB(skb)->tcp_gso_segs
Fixes: 832d11c5cd07 ("tcp: Try to restore large SKBs while SACK processing")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Jonathan Looney <jtl@netflix.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Reviewed-by: Tyler Hicks <tyhicks@canonical.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Bruce Curtis <brucec@netflix.com>
Cc: Jonathan Lemon <jonathan.lemon@gmail.com>
Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
---
include/linux/tcp.h | 4 ++++
include/net/tcp.h | 2 ++
net/ipv4/tcp.c | 1 +
net/ipv4/tcp_input.c | 26 ++++++++++++++++++++------
net/ipv4/tcp_output.c | 6 +++---
5 files changed, 30 insertions(+), 9 deletions(-)
--- a/include/linux/tcp.h
+++ b/include/linux/tcp.h
@@ -485,4 +485,8 @@ static inline u16 tcp_mss_clamp(const st
return (user_mss && user_mss < mss) ? user_mss : mss;
}
+
+int tcp_skb_shift(struct sk_buff *to, struct sk_buff *from, int pcount,
+ int shiftlen);
+
#endif /* _LINUX_TCP_H */
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -55,6 +55,8 @@ void tcp_time_wait(struct sock *sk, int
#define MAX_TCP_HEADER (128 + MAX_HEADER)
#define MAX_TCP_OPTION_SPACE 40
+#define TCP_MIN_SND_MSS 48
+#define TCP_MIN_GSO_SIZE (TCP_MIN_SND_MSS - MAX_TCP_OPTION_SPACE)
/*
* Never offer a window over 32767 without using window scaling. Some
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -3829,6 +3829,7 @@ void __init tcp_init(void)
unsigned long limit;
unsigned int i;
+ BUILD_BUG_ON(TCP_MIN_SND_MSS <= MAX_TCP_OPTION_SPACE);
BUILD_BUG_ON(sizeof(struct tcp_skb_cb) >
FIELD_SIZEOF(struct sk_buff, cb));
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -1315,7 +1315,7 @@ static bool tcp_shifted_skb(struct sock
TCP_SKB_CB(skb)->seq += shifted;
tcp_skb_pcount_add(prev, pcount);
- BUG_ON(tcp_skb_pcount(skb) < pcount);
+ WARN_ON_ONCE(tcp_skb_pcount(skb) < pcount);
tcp_skb_pcount_add(skb, -pcount);
/* When we're adding to gso_segs == 1, gso_size will be zero,
@@ -1381,6 +1381,21 @@ static int skb_can_shift(const struct sk
return !skb_headlen(skb) && skb_is_nonlinear(skb);
}
+int tcp_skb_shift(struct sk_buff *to, struct sk_buff *from,
+ int pcount, int shiftlen)
+{
+ /* TCP min gso_size is 8 bytes (TCP_MIN_GSO_SIZE)
+ * Since TCP_SKB_CB(skb)->tcp_gso_segs is 16 bits, we need
+ * to make sure not storing more than 65535 * 8 bytes per skb,
+ * even if current MSS is bigger.
+ */
+ if (unlikely(to->len + shiftlen >= 65535 * TCP_MIN_GSO_SIZE))
+ return 0;
+ if (unlikely(tcp_skb_pcount(to) + pcount > 65535))
+ return 0;
+ return skb_shift(to, from, shiftlen);
+}
+
/* Try collapsing SACK blocks spanning across multiple skbs to a single
* skb.
*/
@@ -1486,7 +1501,7 @@ static struct sk_buff *tcp_shift_skb_dat
if (!after(TCP_SKB_CB(skb)->seq + len, tp->snd_una))
goto fallback;
- if (!skb_shift(prev, skb, len))
+ if (!tcp_skb_shift(prev, skb, pcount, len))
goto fallback;
if (!tcp_shifted_skb(sk, prev, skb, state, pcount, len, mss, dup_sack))
goto out;
@@ -1504,11 +1519,10 @@ static struct sk_buff *tcp_shift_skb_dat
goto out;
len = skb->len;
- if (skb_shift(prev, skb, len)) {
- pcount += tcp_skb_pcount(skb);
- tcp_shifted_skb(sk, prev, skb, state, tcp_skb_pcount(skb),
+ pcount = tcp_skb_pcount(skb);
+ if (tcp_skb_shift(prev, skb, pcount, len))
+ tcp_shifted_skb(sk, prev, skb, state, pcount,
len, mss, 0);
- }
out:
return prev;
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -1457,8 +1457,8 @@ static inline int __tcp_mtu_to_mss(struc
mss_now -= icsk->icsk_ext_hdr_len;
/* Then reserve room for full set of TCP options and 8 bytes of data */
- if (mss_now < 48)
- mss_now = 48;
+ if (mss_now < TCP_MIN_SND_MSS)
+ mss_now = TCP_MIN_SND_MSS;
return mss_now;
}
@@ -2727,7 +2727,7 @@ static bool tcp_collapse_retrans(struct
if (next_skb_size <= skb_availroom(skb))
skb_copy_bits(next_skb, 0, skb_put(skb, next_skb_size),
next_skb_size);
- else if (!skb_shift(skb, next_skb, next_skb_size))
+ else if (!tcp_skb_shift(skb, next_skb, 1, next_skb_size))
return false;
}
tcp_highest_sack_replace(sk, next_skb, skb);
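
The overflow arithmetic from the commit message, spelled out as a small check (constants match TCP_MIN_GSO_SIZE and the x86 fragment size mentioned above):

  #include <stdint.h>
  #include <stdio.h>

  int main(void)
  {
      const long skb_len = 17L * 32 * 1024; /* 17 frags x 32 KiB = 557056 */
      const long min_gso = 8;               /* TCP_MIN_GSO_SIZE */
      long pcount = skb_len / min_gso;      /* 69632 segments */

      printf("pcount %ld %s u16 max %u\n", pcount,
             pcount > UINT16_MAX ? ">" : "<=", (unsigned)UINT16_MAX);
      /* tcp_skb_shift() therefore caps coalescing at 65535 * 8 bytes */
      printf("cap = %ld bytes\n", 65535L * min_gso);
      return 0;
  }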


@@ -1,40 +0,0 @@
From: Eric Dumazet <edumazet@google.com>
Date: Fri, 21 Jun 2019 06:09:55 -0700
Subject: tcp: refine memory limit test in tcp_fragment()
Origin: https://git.kernel.org/linus/b6653b3629e5b88202be3c9abc44713973f5c4b4
Bug-Debian: https://bugs.debian.org/930904
tcp_fragment() might be called for skbs in the write queue.
Memory limits might have been exceeded because tcp_sendmsg() only
checks limits at full skb (64KB) boundaries.
Therefore, we need to make sure tcp_fragment() won't punish applications
that might have setup very low SO_SNDBUF values.
Fixes: f070ef2ac667 ("tcp: tcp_fragment() should apply sane memory limits")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Christoph Paasch <cpaasch@apple.com>
Tested-by: Christoph Paasch <cpaasch@apple.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
net/ipv4/tcp_output.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 00c01a01b547..0ebc33d1c9e5 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -1296,7 +1296,8 @@ int tcp_fragment(struct sock *sk, enum tcp_queue tcp_queue,
if (nsize < 0)
nsize = 0;
- if (unlikely((sk->sk_wmem_queued >> 1) > sk->sk_sndbuf)) {
+ if (unlikely((sk->sk_wmem_queued >> 1) > sk->sk_sndbuf &&
+ tcp_queue != TCP_FRAG_IN_WRITE_QUEUE)) {
NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPWQUEUETOOBIG);
return -ENOMEM;
}
--
2.20.1


@@ -1,70 +0,0 @@
From: Eric Dumazet <edumazet@google.com>
Date: Mon, 17 Jun 2019 10:03:52 -0700
Subject: [PATCH net 2/4] tcp: tcp_fragment() should apply sane memory limits
Origin: https://patchwork.ozlabs.org/patch/1117156/
Jonathan Looney reported that a malicious peer can force a sender
to fragment its retransmit queue into tiny skbs, inflating memory
usage and/or overflowing 32bit counters.
TCP allows an application to queue up to sk_sndbuf bytes,
so we need to give some allowance for non-malicious splitting
of the retransmit queue.
A new SNMP counter is added to monitor how many times TCP
did not allow splitting an skb if the allowance was exceeded.
Note that this counter might increase in the case applications
use the SO_SNDBUF socket option to lower sk_sndbuf.
CVE-2019-11478 : tcp_fragment, prevent fragmenting a packet when the
socket is already using more than half the allowed space
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Jonathan Looney <jtl@netflix.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Reviewed-by: Tyler Hicks <tyhicks@canonical.com>
Cc: Bruce Curtis <brucec@netflix.com>
Cc: Jonathan Lemon <jonathan.lemon@gmail.com>
Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
---
include/uapi/linux/snmp.h | 1 +
net/ipv4/proc.c | 1 +
net/ipv4/tcp_output.c | 5 +++++
3 files changed, 7 insertions(+)
--- a/include/uapi/linux/snmp.h
+++ b/include/uapi/linux/snmp.h
@@ -282,6 +282,7 @@ enum
LINUX_MIB_TCPACKCOMPRESSED, /* TCPAckCompressed */
LINUX_MIB_TCPZEROWINDOWDROP, /* TCPZeroWindowDrop */
LINUX_MIB_TCPRCVQDROP, /* TCPRcvQDrop */
+ LINUX_MIB_TCPWQUEUETOOBIG, /* TCPWqueueTooBig */
__LINUX_MIB_MAX
};
--- a/net/ipv4/proc.c
+++ b/net/ipv4/proc.c
@@ -290,6 +290,7 @@ static const struct snmp_mib snmp4_net_l
SNMP_MIB_ITEM("TCPAckCompressed", LINUX_MIB_TCPACKCOMPRESSED),
SNMP_MIB_ITEM("TCPZeroWindowDrop", LINUX_MIB_TCPZEROWINDOWDROP),
SNMP_MIB_ITEM("TCPRcvQDrop", LINUX_MIB_TCPRCVQDROP),
+ SNMP_MIB_ITEM("TCPWqueueTooBig", LINUX_MIB_TCPWQUEUETOOBIG),
SNMP_MIB_SENTINEL
};
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -1299,6 +1299,11 @@ int tcp_fragment(struct sock *sk, enum t
if (nsize < 0)
nsize = 0;
+ if (unlikely((sk->sk_wmem_queued >> 1) > sk->sk_sndbuf)) {
+ NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPWQUEUETOOBIG);
+ return -ENOMEM;
+ }
+
if (skb_unclone(skb, gfp))
return -ENOMEM;
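
The new check is simply "refuse to fragment once the socket queues more than twice its send buffer". A hypothetical restatement of the predicate, with field names following the kernel (note that the "refine memory limit test" follow-up shown earlier in this commit relaxes it for write-queue skbs):

  #include <stdbool.h>

  struct sock_model {
      long sk_wmem_queued;  /* bytes already queued on the socket */
      long sk_sndbuf;       /* SO_SNDBUF limit */
  };

  /* (queued >> 1) > sndbuf  <=>  queued > 2 * sndbuf */
  static bool fragment_refused(const struct sock_model *sk)
  {
      return (sk->sk_wmem_queued >> 1) > sk->sk_sndbuf;
  }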


@@ -1,137 +0,0 @@
From: Jann Horn <jannh@google.com>
Date: Thu, 4 Apr 2019 23:59:25 +0200
Subject: tracing: Fix buffer_ref pipe ops
Origin: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/commit?id=cffeb9c84d20816a2173e3cfeca210c8bfa8e357
commit b987222654f84f7b4ca95b3a55eca784cb30235b upstream.
This fixes multiple issues in buffer_pipe_buf_ops:
- The ->steal() handler must not return zero unless the pipe buffer has
the only reference to the page. But generic_pipe_buf_steal() assumes
that every reference to the pipe is tracked by the page's refcount,
which isn't true for these buffers - buffer_pipe_buf_get(), which
duplicates a buffer, doesn't touch the page's refcount.
Fix it by using generic_pipe_buf_nosteal(), which refuses every
attempted theft. It should be easy to actually support ->steal, but the
only current users of pipe_buf_steal() are the virtio console and FUSE,
and they also only use it as an optimization. So it's probably not worth
the effort.
- The ->get() and ->release() handlers can be invoked concurrently on pipe
buffers backed by the same struct buffer_ref. Make them safe against
concurrency by using refcount_t.
- The pointers stored in ->private were only zeroed out when the last
reference to the buffer_ref was dropped. As far as I know, this
shouldn't be necessary anyway, but if we do it, let's always do it.
Link: http://lkml.kernel.org/r/20190404215925.253531-1-jannh@google.com
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: stable@vger.kernel.org
Fixes: 73a757e63114d ("ring-buffer: Return reader page back into existing ring buffer")
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
fs/splice.c | 4 ++--
include/linux/pipe_fs_i.h | 1 +
kernel/trace/trace.c | 28 ++++++++++++++--------------
3 files changed, 17 insertions(+), 16 deletions(-)
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -333,8 +333,8 @@ const struct pipe_buf_operations default
.get = generic_pipe_buf_get,
};
-static int generic_pipe_buf_nosteal(struct pipe_inode_info *pipe,
- struct pipe_buffer *buf)
+int generic_pipe_buf_nosteal(struct pipe_inode_info *pipe,
+ struct pipe_buffer *buf)
{
return 1;
}
--- a/include/linux/pipe_fs_i.h
+++ b/include/linux/pipe_fs_i.h
@@ -181,6 +181,7 @@ void free_pipe_info(struct pipe_inode_in
void generic_pipe_buf_get(struct pipe_inode_info *, struct pipe_buffer *);
int generic_pipe_buf_confirm(struct pipe_inode_info *, struct pipe_buffer *);
int generic_pipe_buf_steal(struct pipe_inode_info *, struct pipe_buffer *);
+int generic_pipe_buf_nosteal(struct pipe_inode_info *, struct pipe_buffer *);
void generic_pipe_buf_release(struct pipe_inode_info *, struct pipe_buffer *);
void pipe_buf_mark_unmergeable(struct pipe_buffer *buf);
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -6800,19 +6800,23 @@ struct buffer_ref {
struct ring_buffer *buffer;
void *page;
int cpu;
- int ref;
+ refcount_t refcount;
};
+static void buffer_ref_release(struct buffer_ref *ref)
+{
+ if (!refcount_dec_and_test(&ref->refcount))
+ return;
+ ring_buffer_free_read_page(ref->buffer, ref->cpu, ref->page);
+ kfree(ref);
+}
+
static void buffer_pipe_buf_release(struct pipe_inode_info *pipe,
struct pipe_buffer *buf)
{
struct buffer_ref *ref = (struct buffer_ref *)buf->private;
- if (--ref->ref)
- return;
-
- ring_buffer_free_read_page(ref->buffer, ref->cpu, ref->page);
- kfree(ref);
+ buffer_ref_release(ref);
buf->private = 0;
}
@@ -6821,7 +6825,7 @@ static void buffer_pipe_buf_get(struct p
{
struct buffer_ref *ref = (struct buffer_ref *)buf->private;
- ref->ref++;
+ refcount_inc(&ref->refcount);
}
/* Pipe buffer operations for a buffer. */
@@ -6829,7 +6833,7 @@ static const struct pipe_buf_operations
.can_merge = 0,
.confirm = generic_pipe_buf_confirm,
.release = buffer_pipe_buf_release,
- .steal = generic_pipe_buf_steal,
+ .steal = generic_pipe_buf_nosteal,
.get = buffer_pipe_buf_get,
};
@@ -6842,11 +6846,7 @@ static void buffer_spd_release(struct sp
struct buffer_ref *ref =
(struct buffer_ref *)spd->partial[i].private;
- if (--ref->ref)
- return;
-
- ring_buffer_free_read_page(ref->buffer, ref->cpu, ref->page);
- kfree(ref);
+ buffer_ref_release(ref);
spd->partial[i].private = 0;
}
@@ -6901,7 +6901,7 @@ tracing_buffers_splice_read(struct file
break;
}
- ref->ref = 1;
+ refcount_set(&ref->refcount, 1);
ref->buffer = iter->trace_buffer->buffer;
ref->page = ring_buffer_alloc_read_page(ref->buffer, iter->cpu_file);
if (IS_ERR(ref->page)) {
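The switch from a plain int to refcount_t is the standard cure for concurrent ->get()/->release() races. A minimal sketch of the same pattern, with C11 atomics standing in for the kernel's refcount_t and illustrative names throughout:

#include <stdatomic.h>
#include <stdlib.h>

struct buffer_ref_sketch {
	atomic_int refcount;	/* plays the role of refcount_t */
	void *page;
};

static void buffer_ref_get(struct buffer_ref_sketch *ref)
{
	atomic_fetch_add(&ref->refcount, 1);
}

static void buffer_ref_release(struct buffer_ref_sketch *ref)
{
	/* atomic_fetch_sub returns the old value: only the last holder frees,
	 * and two racing callers can never both see the count hit zero */
	if (atomic_fetch_sub(&ref->refcount, 1) != 1)
		return;
	free(ref->page);
	free(ref);
}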

View File

@ -1,94 +0,0 @@
From: Alex Williamson <alex.williamson@redhat.com>
Date: Wed, 3 Apr 2019 12:36:21 -0600
Subject: vfio/type1: Limit DMA mappings per container
Origin: https://git.kernel.org/linus/492855939bdb59c6f947b0b5b44af9ad82b7e38c
Bug-Debian-Security: https://security-tracker.debian.org/tracker/CVE-2019-3882
Memory backed DMA mappings are accounted against a user's locked
memory limit, including multiple mappings of the same memory. This
accounting bounds the number of such mappings that a user can create.
However, DMA mappings that are not backed by memory, such as DMA
mappings of device MMIO via mmaps, do not make use of page pinning
and therefore do not count against the user's locked memory limit.
These mappings still consume memory, but the memory is not well
associated to the process for the purpose of oom killing a task.
To add bounding on this use case, we introduce a limit to the total
number of concurrent DMA mappings that a user is allowed to create.
This limit is exposed as a tunable module option where the default
value of 64K is expected to be well in excess of any reasonable use
case (a large virtual machine configuration would typically only make
use of tens of concurrent mappings).
This fixes CVE-2019-3882.
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
---
drivers/vfio/vfio_iommu_type1.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)
diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 73652e21efec..d0f731c9920a 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -58,12 +58,18 @@ module_param_named(disable_hugepages,
MODULE_PARM_DESC(disable_hugepages,
"Disable VFIO IOMMU support for IOMMU hugepages.");
+static unsigned int dma_entry_limit __read_mostly = U16_MAX;
+module_param_named(dma_entry_limit, dma_entry_limit, uint, 0644);
+MODULE_PARM_DESC(dma_entry_limit,
+ "Maximum number of user DMA mappings per container (65535).");
+
struct vfio_iommu {
struct list_head domain_list;
struct vfio_domain *external_domain; /* domain for external user */
struct mutex lock;
struct rb_root dma_list;
struct blocking_notifier_head notifier;
+ unsigned int dma_avail;
bool v2;
bool nesting;
};
@@ -836,6 +842,7 @@ static void vfio_remove_dma(struct vfio_iommu *iommu, struct vfio_dma *dma)
vfio_unlink_dma(iommu, dma);
put_task_struct(dma->task);
kfree(dma);
+ iommu->dma_avail++;
}
static unsigned long vfio_pgsize_bitmap(struct vfio_iommu *iommu)
@@ -1081,12 +1088,18 @@ static int vfio_dma_do_map(struct vfio_iommu *iommu,
goto out_unlock;
}
+ if (!iommu->dma_avail) {
+ ret = -ENOSPC;
+ goto out_unlock;
+ }
+
dma = kzalloc(sizeof(*dma), GFP_KERNEL);
if (!dma) {
ret = -ENOMEM;
goto out_unlock;
}
+ iommu->dma_avail--;
dma->iova = iova;
dma->vaddr = vaddr;
dma->prot = prot;
@@ -1583,6 +1596,7 @@ static void *vfio_iommu_type1_open(unsigned long arg)
INIT_LIST_HEAD(&iommu->domain_list);
iommu->dma_list = RB_ROOT;
+ iommu->dma_avail = dma_entry_limit;
mutex_init(&iommu->lock);
BLOCKING_INIT_NOTIFIER_HEAD(&iommu->notifier);
--
2.11.0
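A minimal sketch of the counting-down allowance the patch introduces, with illustrative names mirroring the diff: dma_avail starts at the module-parameter limit, is decremented on each successful mapping, incremented on removal, and allocation is refused with -ENOSPC once it reaches zero.

#include <errno.h>

struct iommu_sketch {
	unsigned int dma_avail;	/* initialised from dma_entry_limit (64K default) */
};

static int map_one(struct iommu_sketch *iommu)
{
	if (!iommu->dma_avail)
		return -ENOSPC;	/* user hit the per-container mapping limit */
	iommu->dma_avail--;
	/* ... create the mapping ... */
	return 0;
}

static void unmap_one(struct iommu_sketch *iommu)
{
	/* ... tear down the mapping ... */
	iommu->dma_avail++;
}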

View File

@ -1,56 +0,0 @@
From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Date: Wed, 13 Feb 2019 18:21:31 -0500
Subject: xen/pciback: Don't disable PCI_COMMAND on PCI device reset.
Origin: https://git.kernel.org/linus/7681f31ec9cdacab4fd10570be924f2cef6669ba
Bug-Debian-Security: https://security-tracker.debian.org/tracker/CVE-2015-8553
Bug: http://xenbits.xen.org/xsa/advisory-120.html
There is no need for this at all. Worst it means that if
the guest tries to write to BARs it could lead (on certain
platforms) to PCI SERR errors.
Please note that with af6fc858a35b90e89ea7a7ee58e66628c55c776b
"xen-pciback: limit guest control of command register"
a guest is still allowed to enable those control bits (safely), but
is not allowed to disable them and that therefore a well behaved
frontend which enables things before using them will still
function correctly.
This is done via a write to the configuration register 0x4 which
triggers on the backend side:
command_write
\- pci_enable_device
\- pci_enable_device_flags
\- do_pci_enable_device
\- pcibios_enable_device
\- pci_enable_resources
[which enables the PCI_COMMAND_MEMORY|PCI_COMMAND_IO]
However guests (and drivers) which don't do this could cause
problems, including the security issues which XSA-120 sought
to address.
Reported-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
---
drivers/xen/xen-pciback/pciback_ops.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/drivers/xen/xen-pciback/pciback_ops.c b/drivers/xen/xen-pciback/pciback_ops.c
index ea4a08b83fa0..787966f44589 100644
--- a/drivers/xen/xen-pciback/pciback_ops.c
+++ b/drivers/xen/xen-pciback/pciback_ops.c
@@ -127,8 +127,6 @@ void xen_pcibk_reset_device(struct pci_dev *dev)
if (pci_is_enabled(dev))
pci_disable_device(dev);
- pci_write_config_word(dev, PCI_COMMAND, 0);
-
dev->is_busmaster = 0;
} else {
pci_read_config_word(dev, PCI_COMMAND, &cmd);
--
2.11.0

View File

@ -1,38 +0,0 @@
From: Will Deacon <will.deacon@arm.com>
Date: Wed, 5 Sep 2018 15:34:43 +0100
Subject: arm64: compat: Provide definition for COMPAT_SIGMINSTKSZ
Origin: https://git.kernel.org/linus/24951465cbd279f60b1fdc2421b3694405bcff42
arch/arm/ defines a SIGMINSTKSZ of 2k, so we should use the same value
for compat tasks.
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Dave Martin <Dave.Martin@arm.com>
Reported-by: Steve McIntyre <steve.mcintyre@arm.com>
Tested-by: Steve McIntyre <93sam@debian.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
arch/arm64/include/asm/compat.h | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/arm64/include/asm/compat.h b/arch/arm64/include/asm/compat.h
index 1a037b94eba1..cee28a05ee98 100644
--- a/arch/arm64/include/asm/compat.h
+++ b/arch/arm64/include/asm/compat.h
@@ -159,6 +159,7 @@ static inline compat_uptr_t ptr_to_compat(void __user *uptr)
}
#define compat_user_stack_pointer() (user_stack_pointer(task_pt_regs(current)))
+#define COMPAT_MINSIGSTKSZ 2048
static inline void __user *arch_compat_alloc_user_space(long len)
{
--
2.20.1
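The limit is visible to userspace through sigaltstack(2), which rejects undersized alternate stacks with ENOMEM. A small portable sketch, not taken from the patch:

#include <signal.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
	stack_t ss = {
		.ss_sp    = malloc(MINSIGSTKSZ),
		.ss_size  = MINSIGSTKSZ,	/* anything smaller fails with ENOMEM */
		.ss_flags = 0,
	};

	if (sigaltstack(&ss, NULL) == -1)
		perror("sigaltstack");
	return 0;
}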

View File

@ -1,75 +0,0 @@
From: Paul Burton <paul.burton@mips.com>
Date: Tue, 28 May 2019 17:05:03 +0000
Subject: MIPS: Bounds check virt_addr_valid
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Origin: https://git.kernel.org/linus/074a1e1167afd82c26f6d03a9a8b997d564bb241
The virt_addr_valid() function is meant to return true iff
virt_to_page() will return a valid struct page reference. This is true
iff the address provided is found within the unmapped address range
between PAGE_OFFSET & MAP_BASE, but we don't currently check for that
condition. Instead we simply mask the address to obtain what will be a
physical address if the virtual address is indeed in the desired range,
shift it to form a PFN & then call pfn_valid(). This can incorrectly
return true if called with a virtual address which, after masking,
happens to form a physical address corresponding to a valid PFN.
For example we may vmalloc an address in the kernel mapped region
starting at MAP_BASE & obtain the virtual address:
addr = 0xc000000000002000
When masked by virt_to_phys(), which uses __pa() & in turn CPHYSADDR(),
we obtain the following (bogus) physical address:
addr = 0x2000
In a common system with PHYS_OFFSET=0 this will correspond to a valid
struct page which should really be accessed by virtual address
PAGE_OFFSET+0x2000, causing virt_addr_valid() to incorrectly return 1
indicating that the original address corresponds to a struct page.
This is equivalent to the ARM64 change made in commit ca219452c6b8
("arm64: Correctly bounds check virt_addr_valid").
This fixes fallout when hardened usercopy is enabled caused by the
related commit 517e1fbeb65f ("mm/usercopy: Drop extra
is_vmalloc_or_module() check") which removed a check for the vmalloc
range that was present from the introduction of the hardened usercopy
feature.
Signed-off-by: Paul Burton <paul.burton@mips.com>
References: ca219452c6b8 ("arm64: Correctly bounds check virt_addr_valid")
References: 517e1fbeb65f ("mm/usercopy: Drop extra is_vmalloc_or_module() check")
Reported-by: Julien Cristau <jcristau@debian.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: YunQiang Su <ysu@wavecomp.com>
URL: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=929366
Cc: stable@vger.kernel.org # v4.12+
Cc: linux-mips@vger.kernel.org
Cc: Yunqiang Su <ysu@wavecomp.com>
---
arch/mips/mm/mmap.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/arch/mips/mm/mmap.c b/arch/mips/mm/mmap.c
index 2f616ebeb7e0..7755a1fad05a 100644
--- a/arch/mips/mm/mmap.c
+++ b/arch/mips/mm/mmap.c
@@ -203,6 +203,11 @@ unsigned long arch_randomize_brk(struct mm_struct *mm)
int __virt_addr_valid(const volatile void *kaddr)
{
+ unsigned long vaddr = (unsigned long)kaddr;
+
+ if ((vaddr < PAGE_OFFSET) || (vaddr >= MAP_BASE))
+ return 0;
+
return pfn_valid(PFN_DOWN(virt_to_phys(kaddr)));
}
EXPORT_SYMBOL_GPL(__virt_addr_valid);
--
2.20.1
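A sketch of the failure mode using the constants from the message; the PAGE_OFFSET, MAP_BASE and mask values here are illustrative assumptions for an LP64 host, not the real MIPS layout:

#include <stdio.h>

#define PAGE_OFFSET_SKETCH 0x8000000000000000UL	/* assumed */
#define MAP_BASE_SKETCH    0xc000000000000000UL	/* assumed */
#define PHYS_MASK_SKETCH   0x000000ffffffffffUL	/* assumed CPHYSADDR-style mask */

static int virt_addr_valid_sketch(unsigned long vaddr)
{
	/* the added bounds check: reject anything outside the unmapped range */
	if (vaddr < PAGE_OFFSET_SKETCH || vaddr >= MAP_BASE_SKETCH)
		return 0;
	return 1;	/* the real code then goes on to pfn_valid() */
}

int main(void)
{
	unsigned long addr = 0xc000000000002000UL;	/* vmalloc'ed, per the message */

	/* masking alone turns it into 0x2000, a plausible physical address,
	 * while the bounds check correctly rejects it */
	printf("masked: %#lx, bounds-checked valid: %d\n",
	       addr & PHYS_MASK_SKETCH, virt_addr_valid_sketch(addr));
	return 0;
}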

View File

@ -1,52 +0,0 @@
From: Aurelien Jarno <aurelien@aurel32.net>
Date: Tue, 9 Apr 2019 16:53:55 +0200
Subject: MIPS: scall64-o32: Fix indirect syscall number load
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Origin: https://git.kernel.org/linus/79b4a9cf0e2ea8203ce777c8d5cfa86c71eae86e
Commit 4c21b8fd8f14 (MIPS: seccomp: Handle indirect system calls (o32))
added indirect syscall detection for O32 processes running on MIPS64,
but it did not work correctly for big endian kernel/processes. The
reason is that the syscall number is loaded from ARG1 using the lw
instruction while this is a 64-bit value, so zero is loaded instead of
the syscall number.
Fix the code by using the ld instruction instead. When running a 32-bit
processes on a 64 bit CPU, the values are properly sign-extended, so it
ensures the value passed to syscall_trace_enter is correct.
Recent systemd versions with seccomp enabled whitelist the getpid
syscall for their internal processes (e.g. systemd-journald), but call
it through syscall(SYS_getpid). This fix therefore allows O32 big endian
systems with a 64-bit kernel to run recent systemd versions.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Cc: <stable@vger.kernel.org> # v3.15+
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Paul Burton <paul.burton@mips.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: James Hogan <jhogan@kernel.org>
Cc: linux-mips@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---
arch/mips/kernel/scall64-o32.S | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/mips/kernel/scall64-o32.S b/arch/mips/kernel/scall64-o32.S
index f158c5894a9a..feb2653490df 100644
--- a/arch/mips/kernel/scall64-o32.S
+++ b/arch/mips/kernel/scall64-o32.S
@@ -125,7 +125,7 @@ trace_a_syscall:
subu t1, v0, __NR_O32_Linux
move a1, v0
bnez t1, 1f /* __NR_syscall at offset 0 */
- lw a1, PT_R4(sp) /* Arg1 for __NR_syscall case */
+ ld a1, PT_R4(sp) /* Arg1 for __NR_syscall case */
.set pop
1: jal syscall_trace_enter
--
2.20.1
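A portable sketch of the endianness effect: a 32-bit load from the start of a big-endian 64-bit slot sees only the (zero) high half, while a 64-bit load sees the whole value. The syscall number 4020 (o32 getpid) is used as an illustrative value:

#include <stdint.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
	uint64_t slot = 4020;	/* o32 __NR_getpid, held in a 64-bit stack slot */
	uint8_t bytes[8];
	uint32_t high_half;

	/* lay the slot out big-endian, as on the kernels in question */
	for (int i = 0; i < 8; i++)
		bytes[i] = (uint8_t)(slot >> (56 - 8 * i));

	memcpy(&high_half, bytes, 4);	/* what `lw a1, PT_R4(sp)` would read */
	printf("lw-style load: %u, ld-style load: %llu\n",
	       (unsigned)high_half, (unsigned long long)slot);
	return 0;
}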

View File

@ -1,140 +0,0 @@
From: Michael Ellerman <mpe@ellerman.id.au>
Date: Wed, 12 Jun 2019 23:35:07 +1000
Subject: powerpc/mm/64s/hash: Reallocate context ids on fork
Origin: https://git.kernel.org/linus/ca72d88378b2f2444d3ec145dd442d449d3fefbc
Bug-Debian-Security: https://security-tracker.debian.org/tracker/CVE-2019-12817
When using the Hash Page Table (HPT) MMU, userspace memory mappings
are managed at two levels. Firstly in the Linux page tables, much like
other architectures, and secondly in the SLB (Segment Lookaside
Buffer) and HPT. It's the SLB and HPT that are actually used by the
hardware to do translations.
As part of the series adding support for 4PB user virtual address
space using the hash MMU, we added support for allocating multiple
"context ids" per process, one for each 512TB chunk of address space.
These are tracked in an array called extended_id in the mm_context_t
of a process that has done a mapping above 512TB.
If such a process forks (ie. clone(2) without CLONE_VM set) its mm is
copied, including the mm_context_t, and then init_new_context() is
called to reinitialise parts of the mm_context_t as appropriate to
separate the address spaces of the two processes.
The key step in ensuring the two processes have separate address
spaces is to allocate a new context id for the process, this is done
at the beginning of hash__init_new_context(). If we didn't allocate a
new context id then the two processes would share mappings as far as
the SLB and HPT are concerned, even though their Linux page tables
would be separate.
For mappings above 512TB, which use the extended_id array, we
neglected to allocate new context ids on fork, meaning the parent and
child use the same ids and therefore share those mappings even though
they're supposed to be separate. This can lead to the parent seeing
writes done by the child, which is essentially memory corruption.
There is an additional exposure which is that if the child process
exits, all its context ids are freed, including the context ids that
are still in use by the parent for mappings above 512TB. One or more
of those ids can then be reallocated to a third process, that process
can then read/write to the parent's mappings above 512TB. Additionally
if the freed id is used for the third process's primary context id,
then the parent is able to read/write to the third process's mappings
*below* 512TB.
All of these are fundamental failures to enforce separation between
processes. The only mitigating factor is that the bug only occurs if a
process creates mappings above 512TB, and most applications still do
not create such mappings.
Only machines using the hash page table MMU are affected, eg. PowerPC
970 (G5), PA6T, Power5/6/7/8/9. By default Power9 bare metal machines
(powernv) use the Radix MMU and are not affected, unless the machine
has been explicitly booted in HPT mode (using disable_radix on the
kernel command line). KVM guests on Power9 may be affected if the host
or guest is configured to use the HPT MMU. LPARs under PowerVM on
Power9 are affected as they always use the HPT MMU. Kernels built with
PAGE_SIZE=4K are not affected.
The fix is relatively simple, we need to reallocate context ids for
all extended mappings on fork.
Fixes: f384796c40dc ("powerpc/mm: Add support for handling > 512TB address in SLB miss")
Cc: stable@vger.kernel.org # v4.17+
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
---
arch/powerpc/mm/mmu_context_book3s64.c | 46 +++++++++++++++++++++++---
1 file changed, 42 insertions(+), 4 deletions(-)
diff --git a/arch/powerpc/mm/mmu_context_book3s64.c b/arch/powerpc/mm/mmu_context_book3s64.c
index dbd8f762140b..68984d85ad6b 100644
--- a/arch/powerpc/mm/mmu_context_book3s64.c
+++ b/arch/powerpc/mm/mmu_context_book3s64.c
@@ -53,14 +53,48 @@ int hash__alloc_context_id(void)
}
EXPORT_SYMBOL_GPL(hash__alloc_context_id);
+static int realloc_context_ids(mm_context_t *ctx)
+{
+ int i, id;
+
+ /*
+ * id 0 (aka. ctx->id) is special, we always allocate a new one, even if
+ * there wasn't one allocated previously (which happens in the exec
+ * case where ctx is newly allocated).
+ *
+ * We have to be a bit careful here. We must keep the existing ids in
+ * the array, so that we can test if they're non-zero to decide if we
+ * need to allocate a new one. However in case of error we must free the
+ * ids we've allocated but *not* any of the existing ones (or risk a
+ * UAF). That's why we decrement i at the start of the error handling
+ * loop, to skip the id that we just tested but couldn't reallocate.
+ */
+ for (i = 0; i < ARRAY_SIZE(ctx->extended_id); i++) {
+ if (i == 0 || ctx->extended_id[i]) {
+ id = hash__alloc_context_id();
+ if (id < 0)
+ goto error;
+
+ ctx->extended_id[i] = id;
+ }
+ }
+
+ /* The caller expects us to return id */
+ return ctx->id;
+
+error:
+ for (i--; i >= 0; i--) {
+ if (ctx->extended_id[i])
+ ida_free(&mmu_context_ida, ctx->extended_id[i]);
+ }
+
+ return id;
+}
+
static int hash__init_new_context(struct mm_struct *mm)
{
int index;
- index = hash__alloc_context_id();
- if (index < 0)
- return index;
-
/*
* The old code would re-promote on fork, we don't do that when using
* slices as it could cause problem promoting slices that have been
@@ -78,6 +112,10 @@ static int hash__init_new_context(struct mm_struct *mm)
if (mm->context.id == 0)
slice_init_new_context_exec(mm);
+ index = realloc_context_ids(&mm->context);
+ if (index < 0)
+ return index;
+
subpage_prot_init_new_context(mm);
pkey_mm_init(mm);
--
2.20.1
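The unwind rule spelled out in the comment generalises to any partial-allocation error path. A standalone sketch, with stub allocators standing in for hash__alloc_context_id() and ida_free():

#include <errno.h>

#define NSLOTS 8

static int next_id = 1;

static int alloc_id(void)	/* stand-in for hash__alloc_context_id() */
{
	return next_id < 100 ? next_id++ : -ENOSPC;
}

static void free_id(int id)	/* stand-in for ida_free() */
{
	(void)id;
}

/* On failure, free only the ids allocated in this call: the decrement of
 * i before the error loop skips the slot that just failed, and slots that
 * were never populated stay zero and are never freed. */
static int realloc_ids(int ids[NSLOTS])
{
	int i, id = 0;

	for (i = 0; i < NSLOTS; i++) {
		if (i == 0 || ids[i]) {
			id = alloc_id();
			if (id < 0)
				goto error;
			ids[i] = id;
		}
	}
	return ids[0];

error:
	for (i--; i >= 0; i--)
		if (ids[i])
			free_id(ids[i]);
	return id;
}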

View File

@ -1,96 +0,0 @@
From: Michael Neuling <mikey@neuling.org>
Date: Fri, 19 Jul 2019 15:05:02 +1000
Subject: powerpc/tm: Fix oops on sigreturn on systems without TM
Origin: https://git.kernel.org/torvalds/c/f16d80b75a096c52354c6e0a574993f3b0dfbdfe
Bug-Debian-Security: https://security-tracker.debian.org/tracker/CVE-2019-13648
commit f16d80b75a096c52354c6e0a574993f3b0dfbdfe upstream.
On systems like P9 powernv where we have no TM (or P8 booted with
ppc_tm=off), userspace can construct a signal context which still has
the MSR TS bits set. The kernel tries to restore this context which
results in the following crash:
Unexpected TM Bad Thing exception at c0000000000022fc (msr 0x8000000102a03031) tm_scratch=800000020280f033
Oops: Unrecoverable exception, sig: 6 [#1]
LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
Modules linked in:
CPU: 0 PID: 1636 Comm: sigfuz Not tainted 5.2.0-11043-g0a8ad0ffa4 #69
NIP: c0000000000022fc LR: 00007fffb2d67e48 CTR: 0000000000000000
REGS: c00000003fffbd70 TRAP: 0700 Not tainted (5.2.0-11045-g7142b497d8)
MSR: 8000000102a03031 <SF,VEC,VSX,FP,ME,IR,DR,LE,TM[E]> CR: 42004242 XER: 00000000
CFAR: c0000000000022e0 IRQMASK: 0
GPR00: 0000000000000072 00007fffb2b6e560 00007fffb2d87f00 0000000000000669
GPR04: 00007fffb2b6e728 0000000000000000 0000000000000000 00007fffb2b6f2a8
GPR08: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR12: 0000000000000000 00007fffb2b76900 0000000000000000 0000000000000000
GPR16: 00007fffb2370000 00007fffb2d84390 00007fffea3a15ac 000001000a250420
GPR20: 00007fffb2b6f260 0000000010001770 0000000000000000 0000000000000000
GPR24: 00007fffb2d843a0 00007fffea3a14a0 0000000000010000 0000000000800000
GPR28: 00007fffea3a14d8 00000000003d0f00 0000000000000000 00007fffb2b6e728
NIP [c0000000000022fc] rfi_flush_fallback+0x7c/0x80
LR [00007fffb2d67e48] 0x7fffb2d67e48
Call Trace:
Instruction dump:
e96a0220 e96a02a8 e96a0330 e96a03b8 394a0400 4200ffdc 7d2903a6 e92d0c00
e94d0c08 e96d0c10 e82d0c18 7db242a6 <4c000024> 7db243a6 7db142a6 f82d0c18
The problem is the signal code assumes TM is enabled when
CONFIG_PPC_TRANSACTIONAL_MEM is enabled. This may not be the case as
with P9 powernv or if `ppc_tm=off` is used on P8.
This means any local user can crash the system.
Fix the problem by returning a bad stack frame to the user if they try
to set the MSR TS bits with sigreturn() on systems where TM is not
supported.
Found with sigfuz kernel selftest on P9.
This fixes CVE-2019-13648.
Fixes: 2b0a576d15e0 ("powerpc: Add new transactional memory state to the signal context")
Cc: stable@vger.kernel.org # v3.9
Reported-by: Praveen Pandey <Praveen.Pandey@in.ibm.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190719050502.405-1-mikey@neuling.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
arch/powerpc/kernel/signal_32.c | 3 +++
arch/powerpc/kernel/signal_64.c | 5 +++++
2 files changed, 8 insertions(+)
diff --git a/arch/powerpc/kernel/signal_32.c b/arch/powerpc/kernel/signal_32.c
index fd59fef9931b..906b05c2adae 100644
--- a/arch/powerpc/kernel/signal_32.c
+++ b/arch/powerpc/kernel/signal_32.c
@@ -1202,6 +1202,9 @@ SYSCALL_DEFINE0(rt_sigreturn)
goto bad;
if (MSR_TM_ACTIVE(msr_hi<<32)) {
+ /* Trying to start TM on non TM system */
+ if (!cpu_has_feature(CPU_FTR_TM))
+ goto bad;
/* We only recheckpoint on return if we're
* transaction.
*/
diff --git a/arch/powerpc/kernel/signal_64.c b/arch/powerpc/kernel/signal_64.c
index 14b0f5b6a373..b5933d7219db 100644
--- a/arch/powerpc/kernel/signal_64.c
+++ b/arch/powerpc/kernel/signal_64.c
@@ -750,6 +750,11 @@ SYSCALL_DEFINE0(rt_sigreturn)
if (MSR_TM_ACTIVE(msr)) {
/* We recheckpoint on return. */
struct ucontext __user *uc_transact;
+
+ /* Trying to start TM on non TM system */
+ if (!cpu_has_feature(CPU_FTR_TM))
+ goto badframe;
+
if (__get_user(uc_transact, &uc->uc_link))
goto badframe;
if (restore_tm_sigcontexts(current, &uc->uc_mcontext,
--
cgit 1.2-0.3.lf.el7
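On the userspace side, well-behaved code can probe for TM before building transactional state, mirroring the CPU_FTR_TM check the fix adds in the kernel. A sketch using getauxval(3); the PPC_FEATURE2_HTM fallback value is an assumption taken from the uapi cputable header:

#include <stdio.h>
#include <sys/auxv.h>

#ifndef PPC_FEATURE2_HTM
#define PPC_FEATURE2_HTM 0x40000000	/* assumed value, per asm/cputable.h */
#endif

int main(void)
{
	unsigned long hwcap2 = getauxval(AT_HWCAP2);

	printf("transactional memory %savailable\n",
	       (hwcap2 & PPC_FEATURE2_HTM) ? "" : "not ");
	return 0;
}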

View File

@ -1,63 +0,0 @@
From foo@baz Wed 19 Jun 2019 02:34:37 PM CEST
From: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Date: Tue, 11 Jun 2019 17:38:37 +0200
Subject: sunhv: Fix device naming inconsistency between sunhv_console and sunhv_reg
Bug-Debian: https://bugs.debian.org/926539
Origin: https://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc.git/commit/?id=07a6d63eb1b54b5fb38092780fe618dfe1d96e23
From: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
[ Upstream commit 07a6d63eb1b54b5fb38092780fe618dfe1d96e23 ]
In d5a2aa24, the name in struct console sunhv_console was changed from "ttyS"
to "ttyHV" while the name in struct uart_ops sunhv_pops remained unchanged.
This results in the hypervisor console device to be listed as "ttyHV0" under
/proc/consoles while the device node is still named "ttyS0":
root@osaka:~# cat /proc/consoles
ttyHV0 -W- (EC p ) 4:64
tty0 -WU (E ) 4:1
root@osaka:~# readlink /sys/dev/char/4:64
../../devices/root/f02836f0/f0285690/tty/ttyS0
root@osaka:~#
This means that any userland code which tries to determine the name of the
device file of the hypervisor console device can not rely on the information
provided by /proc/consoles. In particular, booting current versions of debian-
installer inside a SPARC LDOM will fail with the installer unable to determine
the console device.
After renaming the device in struct uart_ops sunhv_pops to "ttyHV" as well,
the inconsistency is fixed and it is possible again to determine the name
of the device file of the hypervisor console device by reading the contents
of /proc/console:
root@osaka:~# cat /proc/consoles
ttyHV0 -W- (EC p ) 4:64
tty0 -WU (E ) 4:1
root@osaka:~# readlink /sys/dev/char/4:64
../../devices/root/f02836f0/f0285690/tty/ttyHV0
root@osaka:~#
With this change, debian-installer works correctly when installing inside
a SPARC LDOM.
Signed-off-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/tty/serial/sunhv.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/tty/serial/sunhv.c
+++ b/drivers/tty/serial/sunhv.c
@@ -397,7 +397,7 @@ static const struct uart_ops sunhv_pops
static struct uart_driver sunhv_reg = {
.owner = THIS_MODULE,
.driver_name = "sunhv",
- .dev_name = "ttyS",
+ .dev_name = "ttyHV",
.major = TTY_MAJOR,
};

View File

@ -1,110 +0,0 @@
From: Borislav Petkov <bp@suse.de>
Date: Wed, 19 Jun 2019 17:24:34 +0200
Subject: x86/cpufeatures: Carve out CQM features retrieval
Origin: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=16ad0b63f382a16454cb927f2eb45b32dbb71b94
commit 45fc56e629caa451467e7664fbd4c797c434a6c4 upstream
... into a separate function for better readability. Split out from a
patch from Fenghua Yu <fenghua.yu@intel.com> to keep the mechanical,
sole code movement separate for easy review.
No functional changes.
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: x86@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
arch/x86/kernel/cpu/common.c | 60 ++++++++++++++++++++----------------
1 file changed, 33 insertions(+), 27 deletions(-)
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 1073118b9bf0..a315e475e484 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -808,6 +808,38 @@ static void init_speculation_control(struct cpuinfo_x86 *c)
}
}
+static void init_cqm(struct cpuinfo_x86 *c)
+{
+ u32 eax, ebx, ecx, edx;
+
+ /* Additional Intel-defined flags: level 0x0000000F */
+ if (c->cpuid_level >= 0x0000000F) {
+
+ /* QoS sub-leaf, EAX=0Fh, ECX=0 */
+ cpuid_count(0x0000000F, 0, &eax, &ebx, &ecx, &edx);
+ c->x86_capability[CPUID_F_0_EDX] = edx;
+
+ if (cpu_has(c, X86_FEATURE_CQM_LLC)) {
+ /* will be overridden if occupancy monitoring exists */
+ c->x86_cache_max_rmid = ebx;
+
+ /* QoS sub-leaf, EAX=0Fh, ECX=1 */
+ cpuid_count(0x0000000F, 1, &eax, &ebx, &ecx, &edx);
+ c->x86_capability[CPUID_F_1_EDX] = edx;
+
+ if ((cpu_has(c, X86_FEATURE_CQM_OCCUP_LLC)) ||
+ ((cpu_has(c, X86_FEATURE_CQM_MBM_TOTAL)) ||
+ (cpu_has(c, X86_FEATURE_CQM_MBM_LOCAL)))) {
+ c->x86_cache_max_rmid = ecx;
+ c->x86_cache_occ_scale = ebx;
+ }
+ } else {
+ c->x86_cache_max_rmid = -1;
+ c->x86_cache_occ_scale = -1;
+ }
+ }
+}
+
void get_cpu_cap(struct cpuinfo_x86 *c)
{
u32 eax, ebx, ecx, edx;
@@ -839,33 +871,6 @@ void get_cpu_cap(struct cpuinfo_x86 *c)
c->x86_capability[CPUID_D_1_EAX] = eax;
}
- /* Additional Intel-defined flags: level 0x0000000F */
- if (c->cpuid_level >= 0x0000000F) {
-
- /* QoS sub-leaf, EAX=0Fh, ECX=0 */
- cpuid_count(0x0000000F, 0, &eax, &ebx, &ecx, &edx);
- c->x86_capability[CPUID_F_0_EDX] = edx;
-
- if (cpu_has(c, X86_FEATURE_CQM_LLC)) {
- /* will be overridden if occupancy monitoring exists */
- c->x86_cache_max_rmid = ebx;
-
- /* QoS sub-leaf, EAX=0Fh, ECX=1 */
- cpuid_count(0x0000000F, 1, &eax, &ebx, &ecx, &edx);
- c->x86_capability[CPUID_F_1_EDX] = edx;
-
- if ((cpu_has(c, X86_FEATURE_CQM_OCCUP_LLC)) ||
- ((cpu_has(c, X86_FEATURE_CQM_MBM_TOTAL)) ||
- (cpu_has(c, X86_FEATURE_CQM_MBM_LOCAL)))) {
- c->x86_cache_max_rmid = ecx;
- c->x86_cache_occ_scale = ebx;
- }
- } else {
- c->x86_cache_max_rmid = -1;
- c->x86_cache_occ_scale = -1;
- }
- }
-
/* AMD-defined flags: level 0x80000001 */
eax = cpuid_eax(0x80000000);
c->extended_cpuid_level = eax;
@@ -896,6 +901,7 @@ void get_cpu_cap(struct cpuinfo_x86 *c)
init_scattered_cpuid_features(c);
init_speculation_control(c);
+ init_cqm(c);
/*
* Clear/Set all flags overridden by options, after probe.
--
2.20.1

View File

@ -1,211 +0,0 @@
From: Fenghua Yu <fenghua.yu@intel.com>
Date: Wed, 19 Jun 2019 18:51:09 +0200
Subject: x86/cpufeatures: Combine word 11 and 12 into a new scattered features
word
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Origin: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=b5dd7f61fce44a1d5df5c63ce7bcb9e0a05ce2f7
commit acec0ce081de0c36459eea91647faf99296445a3 upstream
It's a waste for the four X86_FEATURE_CQM_* feature bits to occupy two
whole feature bits words. To better utilize feature words, re-define
word 11 to host scattered features and move the four X86_FEATURE_CQM_*
features into Linux defined word 11. More scattered features can be
added in word 11 in the future.
Rename leaf 11 in cpuid_leafs to CPUID_LNX_4 to reflect it's a
Linux-defined leaf.
Rename leaf 12 as CPUID_DUMMY which will be replaced by a meaningful
name in the next patch when CPUID.7.1:EAX occupies word 12.
Maximum number of RMID and cache occupancy scale are retrieved from
CPUID.0xf.1 after scattered CQM features are enumerated. Carve out the
code into a separate function.
KVM doesn't support resctrl now. So it's safe to move the
X86_FEATURE_CQM_* features to scattered features word 11 for KVM.
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Aaron Lewis <aaronlewis@google.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Babu Moger <babu.moger@amd.com>
Cc: "Chang S. Bae" <chang.seok.bae@intel.com>
Cc: "Sean J Christopherson" <sean.j.christopherson@intel.com>
Cc: Frederic Weisbecker <frederic@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: kvm ML <kvm@vger.kernel.org>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Nadav Amit <namit@vmware.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Pavel Tatashin <pasha.tatashin@oracle.com>
Cc: Peter Feiner <pfeiner@google.com>
Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Cc: Ravi V Shankar <ravi.v.shankar@intel.com>
Cc: Sherry Hurwitz <sherry.hurwitz@amd.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Lendacky <Thomas.Lendacky@amd.com>
Cc: x86 <x86@kernel.org>
Link: https://lkml.kernel.org/r/1560794416-217638-2-git-send-email-fenghua.yu@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
arch/x86/include/asm/cpufeature.h | 4 ++--
arch/x86/include/asm/cpufeatures.h | 17 +++++++------
arch/x86/kernel/cpu/common.c | 38 ++++++++++++------------------
arch/x86/kernel/cpu/cpuid-deps.c | 3 +++
arch/x86/kernel/cpu/scattered.c | 4 ++++
arch/x86/kvm/cpuid.h | 2 --
6 files changed, 34 insertions(+), 34 deletions(-)
diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
index ce95b8cbd229..68889ace9c4c 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -22,8 +22,8 @@ enum cpuid_leafs
CPUID_LNX_3,
CPUID_7_0_EBX,
CPUID_D_1_EAX,
- CPUID_F_0_EDX,
- CPUID_F_1_EDX,
+ CPUID_LNX_4,
+ CPUID_DUMMY,
CPUID_8000_0008_EBX,
CPUID_6_EAX,
CPUID_8000_000A_EDX,
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 0cf704933f23..5041f19918f2 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -271,13 +271,16 @@
#define X86_FEATURE_XGETBV1 (10*32+ 2) /* XGETBV with ECX = 1 instruction */
#define X86_FEATURE_XSAVES (10*32+ 3) /* XSAVES/XRSTORS instructions */
-/* Intel-defined CPU QoS Sub-leaf, CPUID level 0x0000000F:0 (EDX), word 11 */
-#define X86_FEATURE_CQM_LLC (11*32+ 1) /* LLC QoS if 1 */
-
-/* Intel-defined CPU QoS Sub-leaf, CPUID level 0x0000000F:1 (EDX), word 12 */
-#define X86_FEATURE_CQM_OCCUP_LLC (12*32+ 0) /* LLC occupancy monitoring */
-#define X86_FEATURE_CQM_MBM_TOTAL (12*32+ 1) /* LLC Total MBM monitoring */
-#define X86_FEATURE_CQM_MBM_LOCAL (12*32+ 2) /* LLC Local MBM monitoring */
+/*
+ * Extended auxiliary flags: Linux defined - for features scattered in various
+ * CPUID levels like 0xf, etc.
+ *
+ * Reuse free bits when adding new feature flags!
+ */
+#define X86_FEATURE_CQM_LLC (11*32+ 0) /* LLC QoS if 1 */
+#define X86_FEATURE_CQM_OCCUP_LLC (11*32+ 1) /* LLC occupancy monitoring */
+#define X86_FEATURE_CQM_MBM_TOTAL (11*32+ 2) /* LLC Total MBM monitoring */
+#define X86_FEATURE_CQM_MBM_LOCAL (11*32+ 3) /* LLC Local MBM monitoring */
/* AMD-defined CPU features, CPUID level 0x80000008 (EBX), word 13 */
#define X86_FEATURE_CLZERO (13*32+ 0) /* CLZERO instruction */
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index a315e475e484..417d09f2bcaf 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -810,33 +810,25 @@ static void init_speculation_control(struct cpuinfo_x86 *c)
static void init_cqm(struct cpuinfo_x86 *c)
{
- u32 eax, ebx, ecx, edx;
-
- /* Additional Intel-defined flags: level 0x0000000F */
- if (c->cpuid_level >= 0x0000000F) {
+ if (!cpu_has(c, X86_FEATURE_CQM_LLC)) {
+ c->x86_cache_max_rmid = -1;
+ c->x86_cache_occ_scale = -1;
+ return;
+ }
- /* QoS sub-leaf, EAX=0Fh, ECX=0 */
- cpuid_count(0x0000000F, 0, &eax, &ebx, &ecx, &edx);
- c->x86_capability[CPUID_F_0_EDX] = edx;
+ /* will be overridden if occupancy monitoring exists */
+ c->x86_cache_max_rmid = cpuid_ebx(0xf);
- if (cpu_has(c, X86_FEATURE_CQM_LLC)) {
- /* will be overridden if occupancy monitoring exists */
- c->x86_cache_max_rmid = ebx;
+ if (cpu_has(c, X86_FEATURE_CQM_OCCUP_LLC) ||
+ cpu_has(c, X86_FEATURE_CQM_MBM_TOTAL) ||
+ cpu_has(c, X86_FEATURE_CQM_MBM_LOCAL)) {
+ u32 eax, ebx, ecx, edx;
- /* QoS sub-leaf, EAX=0Fh, ECX=1 */
- cpuid_count(0x0000000F, 1, &eax, &ebx, &ecx, &edx);
- c->x86_capability[CPUID_F_1_EDX] = edx;
+ /* QoS sub-leaf, EAX=0Fh, ECX=1 */
+ cpuid_count(0xf, 1, &eax, &ebx, &ecx, &edx);
- if ((cpu_has(c, X86_FEATURE_CQM_OCCUP_LLC)) ||
- ((cpu_has(c, X86_FEATURE_CQM_MBM_TOTAL)) ||
- (cpu_has(c, X86_FEATURE_CQM_MBM_LOCAL)))) {
- c->x86_cache_max_rmid = ecx;
- c->x86_cache_occ_scale = ebx;
- }
- } else {
- c->x86_cache_max_rmid = -1;
- c->x86_cache_occ_scale = -1;
- }
+ c->x86_cache_max_rmid = ecx;
+ c->x86_cache_occ_scale = ebx;
}
}
diff --git a/arch/x86/kernel/cpu/cpuid-deps.c b/arch/x86/kernel/cpu/cpuid-deps.c
index 2c0bd38a44ab..fa07a224e7b9 100644
--- a/arch/x86/kernel/cpu/cpuid-deps.c
+++ b/arch/x86/kernel/cpu/cpuid-deps.c
@@ -59,6 +59,9 @@ static const struct cpuid_dep cpuid_deps[] = {
{ X86_FEATURE_AVX512_4VNNIW, X86_FEATURE_AVX512F },
{ X86_FEATURE_AVX512_4FMAPS, X86_FEATURE_AVX512F },
{ X86_FEATURE_AVX512_VPOPCNTDQ, X86_FEATURE_AVX512F },
+ { X86_FEATURE_CQM_OCCUP_LLC, X86_FEATURE_CQM_LLC },
+ { X86_FEATURE_CQM_MBM_TOTAL, X86_FEATURE_CQM_LLC },
+ { X86_FEATURE_CQM_MBM_LOCAL, X86_FEATURE_CQM_LLC },
{}
};
diff --git a/arch/x86/kernel/cpu/scattered.c b/arch/x86/kernel/cpu/scattered.c
index 772c219b6889..5a52672e3f8b 100644
--- a/arch/x86/kernel/cpu/scattered.c
+++ b/arch/x86/kernel/cpu/scattered.c
@@ -21,6 +21,10 @@ struct cpuid_bit {
static const struct cpuid_bit cpuid_bits[] = {
{ X86_FEATURE_APERFMPERF, CPUID_ECX, 0, 0x00000006, 0 },
{ X86_FEATURE_EPB, CPUID_ECX, 3, 0x00000006, 0 },
+ { X86_FEATURE_CQM_LLC, CPUID_EDX, 1, 0x0000000f, 0 },
+ { X86_FEATURE_CQM_OCCUP_LLC, CPUID_EDX, 0, 0x0000000f, 1 },
+ { X86_FEATURE_CQM_MBM_TOTAL, CPUID_EDX, 1, 0x0000000f, 1 },
+ { X86_FEATURE_CQM_MBM_LOCAL, CPUID_EDX, 2, 0x0000000f, 1 },
{ X86_FEATURE_CAT_L3, CPUID_EBX, 1, 0x00000010, 0 },
{ X86_FEATURE_CAT_L2, CPUID_EBX, 2, 0x00000010, 0 },
{ X86_FEATURE_CDP_L3, CPUID_ECX, 2, 0x00000010, 1 },
diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
index 9a327d5b6d1f..d78a61408243 100644
--- a/arch/x86/kvm/cpuid.h
+++ b/arch/x86/kvm/cpuid.h
@@ -47,8 +47,6 @@ static const struct cpuid_reg reverse_cpuid[] = {
[CPUID_8000_0001_ECX] = {0x80000001, 0, CPUID_ECX},
[CPUID_7_0_EBX] = { 7, 0, CPUID_EBX},
[CPUID_D_1_EAX] = { 0xd, 1, CPUID_EAX},
- [CPUID_F_0_EDX] = { 0xf, 0, CPUID_EDX},
- [CPUID_F_1_EDX] = { 0xf, 1, CPUID_EDX},
[CPUID_8000_0008_EBX] = {0x80000008, 0, CPUID_EBX},
[CPUID_6_EAX] = { 6, 0, CPUID_EAX},
[CPUID_8000_000A_EDX] = {0x8000000a, 0, CPUID_EDX},
--
2.20.1
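The scattered table describes each feature as a (leaf, subleaf, register, bit) tuple rather than caching whole CPUID words. The same CQM bits can be probed from userspace with GCC's <cpuid.h> helper; a sketch, with the bit positions taken from the table above:

#include <cpuid.h>
#include <stdio.h>

int main(void)
{
	unsigned int eax, ebx, ecx, edx;

	/* leaf 0xf, subleaf 1, EDX: bits 0..2 per the cpuid_bits[] entries */
	if (!__get_cpuid_count(0xf, 1, &eax, &ebx, &ecx, &edx))
		return 1;	/* leaf not supported on this CPU */

	printf("CQM_OCCUP_LLC=%u CQM_MBM_TOTAL=%u CQM_MBM_LOCAL=%u\n",
	       (edx >> 0) & 1, (edx >> 1) & 1, (edx >> 2) & 1);
	return 0;
}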

View File

@ -1,40 +0,0 @@
From: Josh Poimboeuf <jpoimboe@redhat.com>
Date: Mon, 15 Jul 2019 11:51:39 -0500
Subject: x86/entry/64: Use JMP instead of JMPQ
Origin: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=931b6bfe8af1069fd1a494ef6ab14509ffeacdc3
Bug-Debian-Security: https://security-tracker.debian.org/tracker/CVE-2019-1125
commit 64dbc122b20f75183d8822618c24f85144a5a94d upstream
Somehow the swapgs mitigation entry code patch ended up with a JMPQ
instruction instead of JMP, where only the short jump is needed. Some
assembler versions apparently fail to optimize JMPQ into a two-byte JMP
when possible, instead always using a 7-byte JMP with relocation. For
some reason that makes the entry code explode with a #GP during boot.
Change it back to "JMP" as originally intended.
Fixes: 18ec54fdd6d1 ("x86/speculation: Prepare entry code for Spectre v1 swapgs mitigations")
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
arch/x86/entry/entry_64.S | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 7d8da285e185..ccb5e3486aee 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -612,7 +612,7 @@ ENTRY(interrupt_entry)
UNWIND_HINT_FUNC
movq (%rdi), %rdi
- jmpq 2f
+ jmp 2f
1:
FENCE_SWAPGS_KERNEL_ENTRY
2:
--
2.20.1

View File

@ -1,175 +0,0 @@
From: Jann Horn <jannh@google.com>
Date: Sun, 2 Jun 2019 03:15:58 +0200
Subject: x86/insn-eval: Fix use-after-free access to LDT entry
Origin: https://git.kernel.org/linus/de9f869616dd95e95c00bdd6b0fcd3421e8a4323
Bug-Debian-Security: https://security-tracker.debian.org/tracker/CVE-2019-13233
get_desc() computes a pointer into the LDT while holding a lock that
protects the LDT from being freed, but then drops the lock and returns the
(now potentially dangling) pointer to its caller.
Fix it by giving the caller a copy of the LDT entry instead.
Fixes: 670f928ba09b ("x86/insn-eval: Add utility function to get segment descriptor")
Cc: stable@vger.kernel.org
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---
arch/x86/lib/insn-eval.c | 47 ++++++++++++++++++++++++-----------------------
1 file changed, 24 insertions(+), 23 deletions(-)
diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index cf00ab6c6621..306c3a0902ba 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -557,7 +557,8 @@ static int get_reg_offset_16(struct insn *insn, struct pt_regs *regs,
}
/**
- * get_desc() - Obtain pointer to a segment descriptor
+ * get_desc() - Obtain contents of a segment descriptor
+ * @out: Segment descriptor contents on success
* @sel: Segment selector
*
* Given a segment selector, obtain a pointer to the segment descriptor.
@@ -565,18 +566,18 @@ static int get_reg_offset_16(struct insn *insn, struct pt_regs *regs,
*
* Returns:
*
- * Pointer to segment descriptor on success.
+ * True on success, false on failure.
*
* NULL on error.
*/
-static struct desc_struct *get_desc(unsigned short sel)
+static bool get_desc(struct desc_struct *out, unsigned short sel)
{
struct desc_ptr gdt_desc = {0, 0};
unsigned long desc_base;
#ifdef CONFIG_MODIFY_LDT_SYSCALL
if ((sel & SEGMENT_TI_MASK) == SEGMENT_LDT) {
- struct desc_struct *desc = NULL;
+ bool success = false;
struct ldt_struct *ldt;
/* Bits [15:3] contain the index of the desired entry. */
@@ -584,12 +585,14 @@ static struct desc_struct *get_desc(unsigned short sel)
mutex_lock(&current->active_mm->context.lock);
ldt = current->active_mm->context.ldt;
- if (ldt && sel < ldt->nr_entries)
- desc = &ldt->entries[sel];
+ if (ldt && sel < ldt->nr_entries) {
+ *out = ldt->entries[sel];
+ success = true;
+ }
mutex_unlock(&current->active_mm->context.lock);
- return desc;
+ return success;
}
#endif
native_store_gdt(&gdt_desc);
@@ -604,9 +607,10 @@ static struct desc_struct *get_desc(unsigned short sel)
desc_base = sel & ~(SEGMENT_RPL_MASK | SEGMENT_TI_MASK);
if (desc_base > gdt_desc.size)
- return NULL;
+ return false;
- return (struct desc_struct *)(gdt_desc.address + desc_base);
+ *out = *(struct desc_struct *)(gdt_desc.address + desc_base);
+ return true;
}
/**
@@ -628,7 +632,7 @@ static struct desc_struct *get_desc(unsigned short sel)
*/
unsigned long insn_get_seg_base(struct pt_regs *regs, int seg_reg_idx)
{
- struct desc_struct *desc;
+ struct desc_struct desc;
short sel;
sel = get_segment_selector(regs, seg_reg_idx);
@@ -666,11 +670,10 @@ unsigned long insn_get_seg_base(struct pt_regs *regs, int seg_reg_idx)
if (!sel)
return -1L;
- desc = get_desc(sel);
- if (!desc)
+ if (!get_desc(&desc, sel))
return -1L;
- return get_desc_base(desc);
+ return get_desc_base(&desc);
}
/**
@@ -692,7 +695,7 @@ unsigned long insn_get_seg_base(struct pt_regs *regs, int seg_reg_idx)
*/
static unsigned long get_seg_limit(struct pt_regs *regs, int seg_reg_idx)
{
- struct desc_struct *desc;
+ struct desc_struct desc;
unsigned long limit;
short sel;
@@ -706,8 +709,7 @@ static unsigned long get_seg_limit(struct pt_regs *regs, int seg_reg_idx)
if (!sel)
return 0;
- desc = get_desc(sel);
- if (!desc)
+ if (!get_desc(&desc, sel))
return 0;
/*
@@ -716,8 +718,8 @@ static unsigned long get_seg_limit(struct pt_regs *regs, int seg_reg_idx)
* not tested when checking the segment limits. In practice,
* this means that the segment ends in (limit << 12) + 0xfff.
*/
- limit = get_desc_limit(desc);
- if (desc->g)
+ limit = get_desc_limit(&desc);
+ if (desc.g)
limit = (limit << 12) + 0xfff;
return limit;
@@ -741,7 +743,7 @@ static unsigned long get_seg_limit(struct pt_regs *regs, int seg_reg_idx)
*/
int insn_get_code_seg_params(struct pt_regs *regs)
{
- struct desc_struct *desc;
+ struct desc_struct desc;
short sel;
if (v8086_mode(regs))
@@ -752,8 +754,7 @@ int insn_get_code_seg_params(struct pt_regs *regs)
if (sel < 0)
return sel;
- desc = get_desc(sel);
- if (!desc)
+ if (!get_desc(&desc, sel))
return -EINVAL;
/*
@@ -761,10 +762,10 @@ int insn_get_code_seg_params(struct pt_regs *regs)
* determines whether a segment contains data or code. If this is a data
* segment, return error.
*/
- if (!(desc->type & BIT(3)))
+ if (!(desc.type & BIT(3)))
return -EINVAL;
- switch ((desc->l << 1) | desc->d) {
+ switch ((desc.l << 1) | desc.d) {
case 0: /*
* Legacy mode. CS.L=0, CS.D=0. Address and operand size are
* both 16-bit.
--
cgit 1.2-0.3.lf.el7
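The fix follows a general rule: never return a pointer into a structure that is only stable while a lock is held; copy the entry out under the lock and hand the caller the copy. A sketch of that pattern, with pthread primitives standing in for the kernel mutex and illustrative types:

#include <pthread.h>
#include <stdbool.h>

struct entry { unsigned long base, limit; };

struct table {
	pthread_mutex_t lock;
	struct entry *entries;	/* may be freed/replaced by other threads */
	unsigned int nr_entries;
};

static bool get_entry(struct table *t, unsigned int idx, struct entry *out)
{
	bool success = false;

	pthread_mutex_lock(&t->lock);
	if (t->entries && idx < t->nr_entries) {
		*out = t->entries[idx];	/* copy, never &t->entries[idx] */
		success = true;
	}
	pthread_mutex_unlock(&t->lock);
	return success;	/* the copy stays valid after the unlock */
}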

View File

@ -1,261 +0,0 @@
From: Josh Poimboeuf <jpoimboe@redhat.com>
Date: Mon, 8 Jul 2019 11:52:26 -0500
Subject: x86/speculation: Enable Spectre v1 swapgs mitigations
Origin: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=23e7a7b3a75f6dd24c161bf7d1399f251bf5c109
Bug-Debian-Security: https://security-tracker.debian.org/tracker/CVE-2019-1125
commit a2059825986a1c8143fd6698774fa9d83733bb11 upstream
The previous commit added macro calls in the entry code which mitigate the
Spectre v1 swapgs issue if the X86_FEATURE_FENCE_SWAPGS_* features are
enabled. Enable those features where applicable.
The mitigations may be disabled with "nospectre_v1" or "mitigations=off".
There are different features which can affect the risk of attack:
- When FSGSBASE is enabled, unprivileged users are able to place any
value in GS, using the wrgsbase instruction. This means they can
write a GS value which points to any value in kernel space, which can
be useful with the following gadget in an interrupt/exception/NMI
handler:
if (coming from user space)
swapgs
mov %gs:<percpu_offset>, %reg1
// dependent load or store based on the value of %reg
// for example: mov %(reg1), %reg2
If an interrupt is coming from user space, and the entry code
speculatively skips the swapgs (due to user branch mistraining), it
may speculatively execute the GS-based load and a subsequent dependent
load or store, exposing the kernel data to an L1 side channel leak.
Note that, on Intel, a similar attack exists in the above gadget when
coming from kernel space, if the swapgs gets speculatively executed to
switch back to the user GS. On AMD, this variant isn't possible
because swapgs is serializing with respect to future GS-based
accesses.
NOTE: The FSGSBASE patch set hasn't been merged yet, so the above case
doesn't exist quite yet.
- When FSGSBASE is disabled, the issue is mitigated somewhat because
unprivileged users must use prctl(ARCH_SET_GS) to set GS, which
restricts GS values to user space addresses only. That means the
gadget would need an additional step, since the target kernel address
needs to be read from user space first. Something like:
if (coming from user space)
swapgs
mov %gs:<percpu_offset>, %reg1
mov (%reg1), %reg2
// dependent load or store based on the value of %reg2
// for example: mov %(reg2), %reg3
It's difficult to audit for this gadget in all the handlers, so while
there are no known instances of it, it's entirely possible that it
exists somewhere (or could be introduced in the future). Without
tooling to analyze all such code paths, consider it vulnerable.
Effects of SMAP on the !FSGSBASE case:
- If SMAP is enabled, and the CPU reports RDCL_NO (i.e., not
susceptible to Meltdown), the kernel is prevented from speculatively
reading user space memory, even L1 cached values. This effectively
disables the !FSGSBASE attack vector.
- If SMAP is enabled, but the CPU *is* susceptible to Meltdown, SMAP
still prevents the kernel from speculatively reading user space
memory. But it does *not* prevent the kernel from reading the
user value from L1, if it has already been cached. This is probably
only a small hurdle for an attacker to overcome.
Thanks to Dave Hansen for contributing the speculative_smap() function.
Thanks to Andrew Cooper for providing the inside scoop on whether swapgs
is serializing on AMD.
[ tglx: Fixed the USER fence decision and polished the comment as suggested
by Dave Hansen ]
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
.../admin-guide/kernel-parameters.txt | 7 +-
arch/x86/kernel/cpu/bugs.c | 115 ++++++++++++++++--
2 files changed, 110 insertions(+), 12 deletions(-)
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -2515,6 +2515,7 @@
Equivalent to: nopti [X86,PPC]
nospectre_v1 [PPC]
nobp=0 [S390]
+ nospectre_v1 [X86]
nospectre_v2 [X86,PPC,S390]
spectre_v2_user=off [X86]
spec_store_bypass_disable=off [X86,PPC]
@@ -2861,9 +2862,9 @@
nosmt=force: Force disable SMT, cannot be undone
via the sysfs control file.
- nospectre_v1 [PPC] Disable mitigations for Spectre Variant 1 (bounds
- check bypass). With this option data leaks are possible
- in the system.
+ nospectre_v1 [X86, PPC] Disable mitigations for Spectre Variant 1
+ (bounds check bypass). With this option data leaks
+ are possible in the system.
nospectre_v2 [X86] Disable all mitigations for the Spectre variant 2
(indirect branch prediction) vulnerability. System may
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -32,6 +32,7 @@
#include <asm/e820/api.h>
#include <asm/hypervisor.h>
+static void __init spectre_v1_select_mitigation(void);
static void __init spectre_v2_select_mitigation(void);
static void __init ssb_select_mitigation(void);
static void __init l1tf_select_mitigation(void);
@@ -96,17 +97,11 @@ void __init check_bugs(void)
if (boot_cpu_has(X86_FEATURE_STIBP))
x86_spec_ctrl_mask |= SPEC_CTRL_STIBP;
- /* Select the proper spectre mitigation before patching alternatives */
+ /* Select the proper CPU mitigations before patching alternatives: */
+ spectre_v1_select_mitigation();
spectre_v2_select_mitigation();
-
- /*
- * Select proper mitigation for any exposure to the Speculative Store
- * Bypass vulnerability.
- */
ssb_select_mitigation();
-
l1tf_select_mitigation();
-
mds_select_mitigation();
arch_smt_update();
@@ -272,6 +267,108 @@ static int __init mds_cmdline(char *str)
early_param("mds", mds_cmdline);
#undef pr_fmt
+#define pr_fmt(fmt) "Spectre V1 : " fmt
+
+enum spectre_v1_mitigation {
+ SPECTRE_V1_MITIGATION_NONE,
+ SPECTRE_V1_MITIGATION_AUTO,
+};
+
+static enum spectre_v1_mitigation spectre_v1_mitigation __ro_after_init =
+ SPECTRE_V1_MITIGATION_AUTO;
+
+static const char * const spectre_v1_strings[] = {
+ [SPECTRE_V1_MITIGATION_NONE] = "Vulnerable: __user pointer sanitization and usercopy barriers only; no swapgs barriers",
+ [SPECTRE_V1_MITIGATION_AUTO] = "Mitigation: usercopy/swapgs barriers and __user pointer sanitization",
+};
+
+static bool is_swapgs_serializing(void)
+{
+ /*
+ * Technically, swapgs isn't serializing on AMD (despite it previously
+ * being documented as such in the APM). But according to AMD, %gs is
+ * updated non-speculatively, and the issuing of %gs-relative memory
+ * operands will be blocked until the %gs update completes, which is
+ * good enough for our purposes.
+ */
+ return boot_cpu_data.x86_vendor == X86_VENDOR_AMD;
+}
+
+/*
+ * Does SMAP provide full mitigation against speculative kernel access to
+ * userspace?
+ */
+static bool smap_works_speculatively(void)
+{
+ if (!boot_cpu_has(X86_FEATURE_SMAP))
+ return false;
+
+ /*
+ * On CPUs which are vulnerable to Meltdown, SMAP does not
+ * prevent speculative access to user data in the L1 cache.
+ * Consider SMAP to be non-functional as a mitigation on these
+ * CPUs.
+ */
+ if (boot_cpu_has(X86_BUG_CPU_MELTDOWN))
+ return false;
+
+ return true;
+}
+
+static void __init spectre_v1_select_mitigation(void)
+{
+ if (!boot_cpu_has_bug(X86_BUG_SPECTRE_V1) || cpu_mitigations_off()) {
+ spectre_v1_mitigation = SPECTRE_V1_MITIGATION_NONE;
+ return;
+ }
+
+ if (spectre_v1_mitigation == SPECTRE_V1_MITIGATION_AUTO) {
+ /*
+ * With Spectre v1, a user can speculatively control either
+ * path of a conditional swapgs with a user-controlled GS
+ * value. The mitigation is to add lfences to both code paths.
+ *
+ * If FSGSBASE is enabled, the user can put a kernel address in
+ * GS, in which case SMAP provides no protection.
+ *
+ * [ NOTE: Don't check for X86_FEATURE_FSGSBASE until the
+ * FSGSBASE enablement patches have been merged. ]
+ *
+ * If FSGSBASE is disabled, the user can only put a user space
+ * address in GS. That makes an attack harder, but still
+ * possible if there's no SMAP protection.
+ */
+ if (!smap_works_speculatively()) {
+ /*
+ * Mitigation can be provided from SWAPGS itself or
+ * PTI as the CR3 write in the Meltdown mitigation
+ * is serializing.
+ *
+ * If neither is there, mitigate with an LFENCE.
+ */
+ if (!is_swapgs_serializing() && !boot_cpu_has(X86_FEATURE_PTI))
+ setup_force_cpu_cap(X86_FEATURE_FENCE_SWAPGS_USER);
+
+ /*
+ * Enable lfences in the kernel entry (non-swapgs)
+ * paths, to prevent user entry from speculatively
+ * skipping swapgs.
+ */
+ setup_force_cpu_cap(X86_FEATURE_FENCE_SWAPGS_KERNEL);
+ }
+ }
+
+ pr_info("%s\n", spectre_v1_strings[spectre_v1_mitigation]);
+}
+
+static int __init nospectre_v1_cmdline(char *str)
+{
+ spectre_v1_mitigation = SPECTRE_V1_MITIGATION_NONE;
+ return 0;
+}
+early_param("nospectre_v1", nospectre_v1_cmdline);
+
+#undef pr_fmt
#define pr_fmt(fmt) "Spectre V2 : " fmt
static enum spectre_v2_mitigation spectre_v2_enabled __ro_after_init =
@@ -1249,7 +1346,7 @@ static ssize_t cpu_show_common(struct de
break;
case X86_BUG_SPECTRE_V1:
- return sprintf(buf, "Mitigation: __user pointer sanitization\n");
+ return sprintf(buf, "%s\n", spectre_v1_strings[spectre_v1_mitigation]);
case X86_BUG_SPECTRE_V2:
return sprintf(buf, "%s%s%s%s%s%s\n", spectre_v2_strings[spectre_v2_enabled],

View File

@ -1,200 +0,0 @@
From: Josh Poimboeuf <jpoimboe@redhat.com>
Date: Mon, 8 Jul 2019 11:52:25 -0500
Subject: x86/speculation: Prepare entry code for Spectre v1 swapgs mitigations
Origin: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=befb822c062b4c3d93380a58d5fd479395e8b267
Bug-Debian-Security: https://security-tracker.debian.org/tracker/CVE-2019-1125
commit 18ec54fdd6d18d92025af097cd042a75cf0ea24c upstream
Spectre v1 isn't only about array bounds checks. It can affect any
conditional checks. The kernel entry code interrupt, exception, and NMI
handlers all have conditional swapgs checks. Those may be problematic in
the context of Spectre v1, as kernel code can speculatively run with a user
GS.
For example:
if (coming from user space)
swapgs
mov %gs:<percpu_offset>, %reg
mov (%reg), %reg1
When coming from user space, the CPU can speculatively skip the swapgs, and
then do a speculative percpu load using the user GS value. So the user can
speculatively force a read of any kernel value. If a gadget exists which
uses the percpu value as an address in another load/store, then the
contents of the kernel value may become visible via an L1 side channel
attack.
A similar attack exists when coming from kernel space. The CPU can
speculatively do the swapgs, causing the user GS to get used for the rest
of the speculative window.
The mitigation is similar to a traditional Spectre v1 mitigation, except:
a) index masking isn't possible; because the index (percpu offset)
isn't user-controlled; and
b) an lfence is needed in both the "from user" swapgs path and the
"from kernel" non-swapgs path (because of the two attacks described
above).
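Schematically, with the fences patched in, the example above becomes (an
illustrative sketch in the same pseudo-code style; the real entry code uses
the alternatives-patched macros added below):
	if (coming from user space)
		swapgs
		lfence	/* FENCE_SWAPGS_USER_ENTRY: a kernel entry can't speculatively run the swapgs */
	else
		lfence	/* FENCE_SWAPGS_KERNEL_ENTRY: a user entry can't speculatively skip the swapgs */
	mov %gs:<percpu_offset>, %reg
	mov (%reg), %reg1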
The user entry swapgs paths already have SWITCH_TO_KERNEL_CR3, which has a
CR3 write when PTI is enabled. Since CR3 writes are serializing, the
lfences can be skipped in those cases.
On the other hand, the kernel entry swapgs paths don't depend on PTI.
To avoid unnecessary lfences for the user entry case, create two separate
features for alternative patching:
X86_FEATURE_FENCE_SWAPGS_USER
X86_FEATURE_FENCE_SWAPGS_KERNEL
Use these features in entry code to patch in lfences where needed.
The features aren't enabled yet, so there's no functional change.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
arch/x86/entry/calling.h | 17 +++++++++++++++++
arch/x86/entry/entry_64.S | 21 ++++++++++++++++++---
arch/x86/include/asm/cpufeatures.h | 2 ++
3 files changed, 37 insertions(+), 3 deletions(-)
diff --git a/arch/x86/entry/calling.h b/arch/x86/entry/calling.h
index e699b2041665..578b5455334f 100644
--- a/arch/x86/entry/calling.h
+++ b/arch/x86/entry/calling.h
@@ -329,6 +329,23 @@ For 32-bit we have the following conventions - kernel is built with
#endif
+/*
+ * Mitigate Spectre v1 for conditional swapgs code paths.
+ *
+ * FENCE_SWAPGS_USER_ENTRY is used in the user entry swapgs code path, to
+ * prevent a speculative swapgs when coming from kernel space.
+ *
+ * FENCE_SWAPGS_KERNEL_ENTRY is used in the kernel entry non-swapgs code path,
+ * to prevent the swapgs from getting speculatively skipped when coming from
+ * user space.
+ */
+.macro FENCE_SWAPGS_USER_ENTRY
+ ALTERNATIVE "", "lfence", X86_FEATURE_FENCE_SWAPGS_USER
+.endm
+.macro FENCE_SWAPGS_KERNEL_ENTRY
+ ALTERNATIVE "", "lfence", X86_FEATURE_FENCE_SWAPGS_KERNEL
+.endm
+
#endif /* CONFIG_X86_64 */
/*
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index e7572a209fbe..7d8da285e185 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -582,7 +582,7 @@ ENTRY(interrupt_entry)
testb $3, CS-ORIG_RAX+8(%rsp)
jz 1f
SWAPGS
-
+ FENCE_SWAPGS_USER_ENTRY
/*
* Switch to the thread stack. The IRET frame and orig_ax are
* on the stack, as well as the return address. RDI..R12 are
@@ -612,8 +612,10 @@ ENTRY(interrupt_entry)
UNWIND_HINT_FUNC
movq (%rdi), %rdi
+ jmpq 2f
1:
-
+ FENCE_SWAPGS_KERNEL_ENTRY
+2:
PUSH_AND_CLEAR_REGS save_ret=1
ENCODE_FRAME_POINTER 8
@@ -1240,6 +1242,13 @@ ENTRY(paranoid_entry)
*/
SAVE_AND_SWITCH_TO_KERNEL_CR3 scratch_reg=%rax save_reg=%r14
+ /*
+ * The above SAVE_AND_SWITCH_TO_KERNEL_CR3 macro doesn't do an
+ * unconditional CR3 write, even in the PTI case. So do an lfence
+ * to prevent GS speculation, regardless of whether PTI is enabled.
+ */
+ FENCE_SWAPGS_KERNEL_ENTRY
+
ret
END(paranoid_entry)
@@ -1290,6 +1299,7 @@ ENTRY(error_entry)
* from user mode due to an IRET fault.
*/
SWAPGS
+ FENCE_SWAPGS_USER_ENTRY
/* We have user CR3. Change to kernel CR3. */
SWITCH_TO_KERNEL_CR3 scratch_reg=%rax
@@ -1311,6 +1321,8 @@ ENTRY(error_entry)
CALL_enter_from_user_mode
ret
+.Lerror_entry_done_lfence:
+ FENCE_SWAPGS_KERNEL_ENTRY
.Lerror_entry_done:
TRACE_IRQS_OFF
ret
@@ -1329,7 +1341,7 @@ ENTRY(error_entry)
cmpq %rax, RIP+8(%rsp)
je .Lbstep_iret
cmpq $.Lgs_change, RIP+8(%rsp)
- jne .Lerror_entry_done
+ jne .Lerror_entry_done_lfence
/*
* hack: .Lgs_change can fail with user gsbase. If this happens, fix up
@@ -1337,6 +1349,7 @@ ENTRY(error_entry)
* .Lgs_change's error handler with kernel gsbase.
*/
SWAPGS
+ FENCE_SWAPGS_USER_ENTRY
SWITCH_TO_KERNEL_CR3 scratch_reg=%rax
jmp .Lerror_entry_done
@@ -1351,6 +1364,7 @@ ENTRY(error_entry)
* gsbase and CR3. Switch to kernel gsbase and CR3:
*/
SWAPGS
+ FENCE_SWAPGS_USER_ENTRY
SWITCH_TO_KERNEL_CR3 scratch_reg=%rax
/*
@@ -1442,6 +1456,7 @@ ENTRY(nmi)
swapgs
cld
+ FENCE_SWAPGS_USER_ENTRY
SWITCH_TO_KERNEL_CR3 scratch_reg=%rdx
movq %rsp, %rdx
movq PER_CPU_VAR(cpu_current_top_of_stack), %rsp
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 5041f19918f2..e0f47f6a1017 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -281,6 +281,8 @@
#define X86_FEATURE_CQM_OCCUP_LLC (11*32+ 1) /* LLC occupancy monitoring */
#define X86_FEATURE_CQM_MBM_TOTAL (11*32+ 2) /* LLC Total MBM monitoring */
#define X86_FEATURE_CQM_MBM_LOCAL (11*32+ 3) /* LLC Local MBM monitoring */
+#define X86_FEATURE_FENCE_SWAPGS_USER (11*32+ 4) /* "" LFENCE in user entry SWAPGS path */
+#define X86_FEATURE_FENCE_SWAPGS_KERNEL (11*32+ 5) /* "" LFENCE in kernel entry SWAPGS path */
/* AMD-defined CPU features, CPUID level 0x80000008 (EBX), word 13 */
#define X86_FEATURE_CLZERO (13*32+ 0) /* CLZERO instruction */
--
2.20.1


@ -1,159 +0,0 @@
From: Thomas Gleixner <tglx@linutronix.de>
Date: Wed, 17 Jul 2019 21:18:59 +0200
Subject: x86/speculation/swapgs: Exclude ATOMs from speculation through SWAPGS
Origin: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=b88241aef6f1654417bb281546da316ffab57807
Bug-Debian-Security: https://security-tracker.debian.org/tracker/CVE-2019-1125
commit f36cf386e3fec258a341d446915862eded3e13d8 upstream
Intel provided the following information:
On all current Atom processors, instructions that use a segment register
value (e.g. a load or store) will not speculatively execute before the
last writer of that segment retires. Thus they will not use a
speculatively written segment value.
That means on ATOMs there is no speculation through SWAPGS, so the SWAPGS
entry paths can be excluded from the extra LFENCE if PTI is disabled.
Create a separate bug flag for the through-SWAPGS speculation and mark all
out-of-order ATOMs and AMD/HYGON CPUs as not affected. The in-order ATOMs
are excluded from the whole mitigation mess anyway.
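Taken together, the selection logic after this change reduces to the
following (a condensed restatement of the common.c and bugs.c hunks below,
not new code):
	/* cpu_set_bug_bits(): any CPU not whitelisted with NO_SWAPGS gets the bug bit */
	if (!cpu_matches(NO_SWAPGS))
		setup_force_cpu_bug(X86_BUG_SWAPGS);
	/*
	 * spectre_v1_select_mitigation(): the user entry path only needs the
	 * extra LFENCE when the CPU speculates through SWAPGS and PTI's
	 * serializing CR3 write is not there to stop it.
	 */
	if (boot_cpu_has_bug(X86_BUG_SWAPGS) && !boot_cpu_has(X86_FEATURE_PTI))
		setup_force_cpu_cap(X86_FEATURE_FENCE_SWAPGS_USER);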
Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Tyler Hicks <tyhicks@canonical.com>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/kernel/cpu/bugs.c | 18 +++----------
arch/x86/kernel/cpu/common.c | 42 +++++++++++++++++++-----------
3 files changed, 32 insertions(+), 29 deletions(-)
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index e0f47f6a1017..759f0a176612 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -388,5 +388,6 @@
#define X86_BUG_L1TF X86_BUG(18) /* CPU is affected by L1 Terminal Fault */
#define X86_BUG_MDS X86_BUG(19) /* CPU is affected by Microarchitectural data sampling */
#define X86_BUG_MSBDS_ONLY X86_BUG(20) /* CPU is only affected by the MSDBS variant of BUG_MDS */
+#define X86_BUG_SWAPGS X86_BUG(21) /* CPU is affected by speculation through SWAPGS */
#endif /* _ASM_X86_CPUFEATURES_H */
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 844ad5d3ef51..ee7d17611ead 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -282,18 +282,6 @@ static const char * const spectre_v1_strings[] = {
[SPECTRE_V1_MITIGATION_AUTO] = "Mitigation: usercopy/swapgs barriers and __user pointer sanitization",
};
-static bool is_swapgs_serializing(void)
-{
- /*
- * Technically, swapgs isn't serializing on AMD (despite it previously
- * being documented as such in the APM). But according to AMD, %gs is
- * updated non-speculatively, and the issuing of %gs-relative memory
- * operands will be blocked until the %gs update completes, which is
- * good enough for our purposes.
- */
- return boot_cpu_data.x86_vendor == X86_VENDOR_AMD;
-}
-
/*
* Does SMAP provide full mitigation against speculative kernel access to
* userspace?
@@ -344,9 +332,11 @@ static void __init spectre_v1_select_mitigation(void)
* PTI as the CR3 write in the Meltdown mitigation
* is serializing.
*
- * If neither is there, mitigate with an LFENCE.
+ * If neither is there, mitigate with an LFENCE to
+ * stop speculation through swapgs.
*/
- if (!is_swapgs_serializing() && !boot_cpu_has(X86_FEATURE_PTI))
+ if (boot_cpu_has_bug(X86_BUG_SWAPGS) &&
+ !boot_cpu_has(X86_FEATURE_PTI))
setup_force_cpu_cap(X86_FEATURE_FENCE_SWAPGS_USER);
/*
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 417d09f2bcaf..b33fdfa0ff49 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -952,6 +952,7 @@ static void identify_cpu_without_cpuid(struct cpuinfo_x86 *c)
#define NO_L1TF BIT(3)
#define NO_MDS BIT(4)
#define MSBDS_ONLY BIT(5)
+#define NO_SWAPGS BIT(6)
#define VULNWL(_vendor, _family, _model, _whitelist) \
{ X86_VENDOR_##_vendor, _family, _model, X86_FEATURE_ANY, _whitelist }
@@ -975,29 +976,37 @@ static const __initconst struct x86_cpu_id cpu_vuln_whitelist[] = {
VULNWL_INTEL(ATOM_BONNELL, NO_SPECULATION),
VULNWL_INTEL(ATOM_BONNELL_MID, NO_SPECULATION),
- VULNWL_INTEL(ATOM_SILVERMONT, NO_SSB | NO_L1TF | MSBDS_ONLY),
- VULNWL_INTEL(ATOM_SILVERMONT_X, NO_SSB | NO_L1TF | MSBDS_ONLY),
- VULNWL_INTEL(ATOM_SILVERMONT_MID, NO_SSB | NO_L1TF | MSBDS_ONLY),
- VULNWL_INTEL(ATOM_AIRMONT, NO_SSB | NO_L1TF | MSBDS_ONLY),
- VULNWL_INTEL(XEON_PHI_KNL, NO_SSB | NO_L1TF | MSBDS_ONLY),
- VULNWL_INTEL(XEON_PHI_KNM, NO_SSB | NO_L1TF | MSBDS_ONLY),
+ VULNWL_INTEL(ATOM_SILVERMONT, NO_SSB | NO_L1TF | MSBDS_ONLY | NO_SWAPGS),
+ VULNWL_INTEL(ATOM_SILVERMONT_X, NO_SSB | NO_L1TF | MSBDS_ONLY | NO_SWAPGS),
+ VULNWL_INTEL(ATOM_SILVERMONT_MID, NO_SSB | NO_L1TF | MSBDS_ONLY | NO_SWAPGS),
+ VULNWL_INTEL(ATOM_AIRMONT, NO_SSB | NO_L1TF | MSBDS_ONLY | NO_SWAPGS),
+ VULNWL_INTEL(XEON_PHI_KNL, NO_SSB | NO_L1TF | MSBDS_ONLY | NO_SWAPGS),
+ VULNWL_INTEL(XEON_PHI_KNM, NO_SSB | NO_L1TF | MSBDS_ONLY | NO_SWAPGS),
VULNWL_INTEL(CORE_YONAH, NO_SSB),
- VULNWL_INTEL(ATOM_AIRMONT_MID, NO_L1TF | MSBDS_ONLY),
+ VULNWL_INTEL(ATOM_AIRMONT_MID, NO_L1TF | MSBDS_ONLY | NO_SWAPGS),
- VULNWL_INTEL(ATOM_GOLDMONT, NO_MDS | NO_L1TF),
- VULNWL_INTEL(ATOM_GOLDMONT_X, NO_MDS | NO_L1TF),
- VULNWL_INTEL(ATOM_GOLDMONT_PLUS, NO_MDS | NO_L1TF),
+ VULNWL_INTEL(ATOM_GOLDMONT, NO_MDS | NO_L1TF | NO_SWAPGS),
+ VULNWL_INTEL(ATOM_GOLDMONT_X, NO_MDS | NO_L1TF | NO_SWAPGS),
+ VULNWL_INTEL(ATOM_GOLDMONT_PLUS, NO_MDS | NO_L1TF | NO_SWAPGS),
+
+ /*
+ * Technically, swapgs isn't serializing on AMD (despite it previously
+ * being documented as such in the APM). But according to AMD, %gs is
+ * updated non-speculatively, and the issuing of %gs-relative memory
+ * operands will be blocked until the %gs update completes, which is
+ * good enough for our purposes.
+ */
/* AMD Family 0xf - 0x12 */
- VULNWL_AMD(0x0f, NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS),
- VULNWL_AMD(0x10, NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS),
- VULNWL_AMD(0x11, NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS),
- VULNWL_AMD(0x12, NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS),
+ VULNWL_AMD(0x0f, NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS),
+ VULNWL_AMD(0x10, NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS),
+ VULNWL_AMD(0x11, NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS),
+ VULNWL_AMD(0x12, NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS),
/* FAMILY_ANY must be last, otherwise 0x0f - 0x12 matches won't work */
- VULNWL_AMD(X86_FAMILY_ANY, NO_MELTDOWN | NO_L1TF | NO_MDS),
+ VULNWL_AMD(X86_FAMILY_ANY, NO_MELTDOWN | NO_L1TF | NO_MDS | NO_SWAPGS),
{}
};
@@ -1034,6 +1043,9 @@ static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c)
setup_force_cpu_bug(X86_BUG_MSBDS_ONLY);
}
+ if (!cpu_matches(NO_SWAPGS))
+ setup_force_cpu_bug(X86_BUG_SWAPGS);
+
if (cpu_matches(NO_MELTDOWN))
return;
--
2.20.1


@ -9,12 +9,10 @@ Patch headers added by debian/patches/features/all/aufs4/gen-patch
SPDX-License-Identifier: GPL-2.0
aufs4.x-rcN mmap patch
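In short, the patch pairs each vma->vm_file with a proc-visible vm_prfile and
reroutes /proc reporting through helpers in the new mm/prfile.c (whose body is
collapsed in this view). The central helper is essentially the sketch below,
assuming the usual aufs4 definitions; the real patch wraps it with
debug-tracing arguments that are omitted here:
	/* Prefer the proc-visible file (vm_prfile) over the aufs branch file. */
	static inline struct file *vma_pr_or_file(struct vm_area_struct *vma)
	{
		struct file *f = vma->vm_file, *pr = vma->vm_prfile;

		return (f && pr) ? pr : f;
	}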
diff --git a/fs/proc/base.c b/fs/proc/base.c
index ccf86f1..a44e9d9 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -2002,7 +2002,7 @@ static int map_files_get_link(struct dentry *dentry, struct path *path)
down_read(&mm->mmap_sem);
@@ -2036,7 +2036,7 @@ static int map_files_get_link(struct den
rc = -ENOENT;
vma = find_exact_vma(mm, vm_start, vm_end);
if (vma && vma->vm_file) {
- *path = vma->vm_file->f_path;
@ -22,11 +20,9 @@ index ccf86f1..a44e9d9 100644
path_get(path);
rc = 0;
}
diff --git a/fs/proc/nommu.c b/fs/proc/nommu.c
index 3b63be6..fb9913b 100644
--- a/fs/proc/nommu.c
+++ b/fs/proc/nommu.c
@@ -45,7 +45,10 @@ static int nommu_region_show(struct seq_file *m, struct vm_region *region)
@@ -45,7 +45,10 @@ static int nommu_region_show(struct seq_
file = region->vm_file;
if (file) {
@ -38,11 +34,9 @@ index 3b63be6..fb9913b 100644
dev = inode->i_sb->s_dev;
ino = inode->i_ino;
}
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 5ea1d64..7865a470 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -305,7 +305,10 @@ show_map_vma(struct seq_file *m, struct vm_area_struct *vma)
@@ -309,7 +309,10 @@ show_map_vma(struct seq_file *m, struct
const char *name = NULL;
if (file) {
@ -54,7 +48,7 @@ index 5ea1d64..7865a470 100644
dev = inode->i_sb->s_dev;
ino = inode->i_ino;
pgoff = ((loff_t)vma->vm_pgoff) << PAGE_SHIFT;
@@ -1727,7 +1730,7 @@ static int show_numa_map(struct seq_file *m, void *v)
@@ -1766,7 +1769,7 @@ static int show_numa_map(struct seq_file
struct proc_maps_private *proc_priv = &numa_priv->proc_maps;
struct vm_area_struct *vma = v;
struct numa_maps *md = &numa_priv->md;
@ -63,11 +57,9 @@ index 5ea1d64..7865a470 100644
struct mm_struct *mm = vma->vm_mm;
struct mm_walk walk = {
.hugetlb_entry = gather_hugetlb_stats,
diff --git a/fs/proc/task_nommu.c b/fs/proc/task_nommu.c
index 0b63d68..400d1c5 100644
--- a/fs/proc/task_nommu.c
+++ b/fs/proc/task_nommu.c
@@ -155,7 +155,10 @@ static int nommu_vma_show(struct seq_file *m, struct vm_area_struct *vma)
@@ -155,7 +155,10 @@ static int nommu_vma_show(struct seq_fil
file = vma->vm_file;
if (file) {
@ -79,11 +71,9 @@ index 0b63d68..400d1c5 100644
dev = inode->i_sb->s_dev;
ino = inode->i_ino;
pgoff = (loff_t)vma->vm_pgoff << PAGE_SHIFT;
diff --git a/include/linux/mm.h b/include/linux/mm.h
index a61ebe8..111f031 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1440,6 +1440,28 @@ static inline void unmap_shared_mapping_range(struct address_space *mapping,
@@ -1453,6 +1453,28 @@ static inline void unmap_shared_mapping_
unmap_mapping_range(mapping, holebegin, holelen, 0);
}
@ -112,8 +102,6 @@ index a61ebe8..111f031 100644
extern int access_process_vm(struct task_struct *tsk, unsigned long addr,
void *buf, int len, unsigned int gup_flags);
extern int access_remote_vm(struct mm_struct *mm, unsigned long addr,
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index cd2bc93..e499e57 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -239,6 +239,7 @@ struct vm_region {
@ -132,11 +120,9 @@ index cd2bc93..e499e57 100644
void * vm_private_data; /* was vm_pte (shared mem) */
atomic_long_t swap_readahead_info;
diff --git a/kernel/fork.c b/kernel/fork.c
index d896e9c..ea800e9 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -505,7 +505,7 @@ static __latent_entropy int dup_mmap(struct mm_struct *mm,
@@ -505,7 +505,7 @@ static __latent_entropy int dup_mmap(str
struct inode *inode = file_inode(file);
struct address_space *mapping = file->f_mapping;
@ -145,11 +131,9 @@ index d896e9c..ea800e9 100644
if (tmp->vm_flags & VM_DENYWRITE)
atomic_dec(&inode->i_writecount);
i_mmap_lock_write(mapping);
diff --git a/mm/Makefile b/mm/Makefile
index 8716bda..68afd6d 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -39,7 +39,7 @@ obj-y := filemap.o mempool.o oom_kill.o \
@@ -39,7 +39,7 @@ obj-y := filemap.o mempool.o oom_kill.
mm_init.o mmu_context.o percpu.o slab_common.o \
compaction.o vmacache.o \
interval_tree.o list_lru.o workingset.o \
@ -158,11 +142,9 @@ index 8716bda..68afd6d 100644
obj-y += init-mm.o
diff --git a/mm/filemap.c b/mm/filemap.c
index 52517f2..250f675 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2700,7 +2700,7 @@ vm_fault_t filemap_page_mkwrite(struct vm_fault *vmf)
@@ -2722,7 +2722,7 @@ vm_fault_t filemap_page_mkwrite(struct v
vm_fault_t ret = VM_FAULT_LOCKED;
sb_start_pagefault(inode->i_sb);
@ -171,11 +153,9 @@ index 52517f2..250f675 100644
lock_page(page);
if (page->mapping != inode->i_mapping) {
unlock_page(page);
diff --git a/mm/mmap.c b/mm/mmap.c
index 5f2b2b1..d71330c 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -180,7 +180,7 @@ static struct vm_area_struct *remove_vma(struct vm_area_struct *vma)
@@ -181,7 +181,7 @@ static struct vm_area_struct *remove_vma
if (vma->vm_ops && vma->vm_ops->close)
vma->vm_ops->close(vma);
if (vma->vm_file)
@ -184,7 +164,7 @@ index 5f2b2b1..d71330c 100644
mpol_put(vma_policy(vma));
vm_area_free(vma);
return next;
@@ -905,7 +905,7 @@ int __vma_adjust(struct vm_area_struct *vma, unsigned long start,
@@ -906,7 +906,7 @@ again:
if (remove_next) {
if (file) {
uprobe_munmap(next, next->vm_start, next->vm_end);
@ -193,7 +173,7 @@ index 5f2b2b1..d71330c 100644
}
if (next->anon_vma)
anon_vma_merge(vma, next);
@@ -1821,8 +1821,8 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
@@ -1822,8 +1822,8 @@ out:
return addr;
unmap_and_free_vma:
@ -203,7 +183,7 @@ index 5f2b2b1..d71330c 100644
/* Undo any partial mapping done by a device driver. */
unmap_region(mm, vma, prev, vma->vm_start, vma->vm_end);
@@ -2641,7 +2641,7 @@ int __split_vma(struct mm_struct *mm, struct vm_area_struct *vma,
@@ -2645,7 +2645,7 @@ int __split_vma(struct mm_struct *mm, st
goto out_free_mpol;
if (new->vm_file)
@ -212,7 +192,7 @@ index 5f2b2b1..d71330c 100644
if (new->vm_ops && new->vm_ops->open)
new->vm_ops->open(new);
@@ -2660,7 +2660,7 @@ int __split_vma(struct mm_struct *mm, struct vm_area_struct *vma,
@@ -2664,7 +2664,7 @@ int __split_vma(struct mm_struct *mm, st
if (new->vm_ops && new->vm_ops->close)
new->vm_ops->close(new);
if (new->vm_file)
@ -221,7 +201,7 @@ index 5f2b2b1..d71330c 100644
unlink_anon_vmas(new);
out_free_mpol:
mpol_put(vma_policy(new));
@@ -2822,7 +2822,7 @@ SYSCALL_DEFINE5(remap_file_pages, unsigned long, start, unsigned long, size,
@@ -2826,7 +2826,7 @@ SYSCALL_DEFINE5(remap_file_pages, unsign
struct vm_area_struct *vma;
unsigned long populate = 0;
unsigned long ret = -EINVAL;
@ -230,7 +210,7 @@ index 5f2b2b1..d71330c 100644
pr_warn_once("%s (%d) uses deprecated remap_file_pages() syscall. See Documentation/vm/remap_file_pages.rst.\n",
current->comm, current->pid);
@@ -2897,10 +2897,27 @@ SYSCALL_DEFINE5(remap_file_pages, unsigned long, start, unsigned long, size,
@@ -2901,10 +2901,27 @@ SYSCALL_DEFINE5(remap_file_pages, unsign
}
}
@ -259,7 +239,7 @@ index 5f2b2b1..d71330c 100644
out:
up_write(&mm->mmap_sem);
if (populate)
@@ -3206,7 +3223,7 @@ struct vm_area_struct *copy_vma(struct vm_area_struct **vmap,
@@ -3210,7 +3227,7 @@ struct vm_area_struct *copy_vma(struct v
if (anon_vma_clone(new_vma, vma))
goto out_free_mempol;
if (new_vma->vm_file)
@ -268,11 +248,9 @@ index 5f2b2b1..d71330c 100644
if (new_vma->vm_ops && new_vma->vm_ops->open)
new_vma->vm_ops->open(new_vma);
vma_link(mm, new_vma, prev, rb_link, rb_parent);
diff --git a/mm/nommu.c b/mm/nommu.c
index e4aac33..b27b200 100644
--- a/mm/nommu.c
+++ b/mm/nommu.c
@@ -625,7 +625,7 @@ static void __put_nommu_region(struct vm_region *region)
@@ -625,7 +625,7 @@ static void __put_nommu_region(struct vm
up_write(&nommu_region_sem);
if (region->vm_file)
@ -281,7 +259,7 @@ index e4aac33..b27b200 100644
/* IO memory and memory shared directly out of the pagecache
* from ramfs/tmpfs mustn't be released here */
@@ -763,7 +763,7 @@ static void delete_vma(struct mm_struct *mm, struct vm_area_struct *vma)
@@ -763,7 +763,7 @@ static void delete_vma(struct mm_struct
if (vma->vm_ops && vma->vm_ops->close)
vma->vm_ops->close(vma);
if (vma->vm_file)
@ -299,7 +277,7 @@ index e4aac33..b27b200 100644
kmem_cache_free(vm_region_jar, region);
region = pregion;
result = start;
@@ -1361,7 +1361,7 @@ unsigned long do_mmap(struct file *file,
@@ -1361,7 +1361,7 @@ error_just_free:
up_write(&nommu_region_sem);
error:
if (region->vm_file)
@ -308,9 +286,6 @@ index e4aac33..b27b200 100644
kmem_cache_free(vm_region_jar, region);
if (vma->vm_file)
fput(vma->vm_file);
diff --git a/mm/prfile.c b/mm/prfile.c
new file mode 100644
index 0000000..a27ac36
--- /dev/null
+++ b/mm/prfile.c
@@ -0,0 +1,86 @@


@ -15,8 +15,6 @@ Signed-off-by: David S. Miller <davem@davemloft.net>
drivers/net/ethernet/amazon/ena/ena_netdev.c | 10 ++++--
4 files changed, 43 insertions(+), 14 deletions(-)
diff --git a/drivers/net/ethernet/amazon/ena/ena_admin_defs.h b/drivers/net/ethernet/amazon/ena/ena_admin_defs.h
index 4532e574ebcd..d735164efea3 100644
--- a/drivers/net/ethernet/amazon/ena/ena_admin_defs.h
+++ b/drivers/net/ethernet/amazon/ena/ena_admin_defs.h
@@ -63,6 +63,8 @@ enum ena_admin_aq_completion_status {
@ -68,7 +66,7 @@ index 4532e574ebcd..d735164efea3 100644
};
struct ena_admin_rss_ind_table_entry {
@@ -1008,6 +1030,13 @@ struct ena_admin_ena_mmio_req_read_less_resp {
@@ -1008,6 +1030,13 @@ struct ena_admin_ena_mmio_req_read_less_
#define ENA_ADMIN_HOST_INFO_MINOR_MASK GENMASK(15, 8)
#define ENA_ADMIN_HOST_INFO_SUB_MINOR_SHIFT 16
#define ENA_ADMIN_HOST_INFO_SUB_MINOR_MASK GENMASK(23, 16)
@ -82,8 +80,6 @@ index 4532e574ebcd..d735164efea3 100644
/* aenq_common_desc */
#define ENA_ADMIN_AENQ_COMMON_DESC_PHASE_MASK BIT(0)
diff --git a/drivers/net/ethernet/amazon/ena/ena_com.c b/drivers/net/ethernet/amazon/ena/ena_com.c
index 7635c38e77dd..b6e6a4721931 100644
--- a/drivers/net/ethernet/amazon/ena/ena_com.c
+++ b/drivers/net/ethernet/amazon/ena/ena_com.c
@@ -41,9 +41,6 @@
@ -96,7 +92,7 @@ index 7635c38e77dd..b6e6a4721931 100644
#define ENA_CTRL_MAJOR 0
#define ENA_CTRL_MINOR 0
@@ -1400,11 +1397,6 @@ int ena_com_validate_version(struct ena_com_dev *ena_dev)
@@ -1400,11 +1397,6 @@ int ena_com_validate_version(struct ena_
ENA_REGS_VERSION_MAJOR_VERSION_SHIFT,
ver & ENA_REGS_VERSION_MINOR_VERSION_MASK);
@ -108,7 +104,7 @@ index 7635c38e77dd..b6e6a4721931 100644
pr_info("ena controller version: %d.%d.%d implementation version %d\n",
(ctrl_ver & ENA_REGS_CONTROLLER_VERSION_MAJOR_VERSION_MASK) >>
ENA_REGS_CONTROLLER_VERSION_MAJOR_VERSION_SHIFT,
@@ -2441,6 +2433,10 @@ int ena_com_allocate_host_info(struct ena_com_dev *ena_dev)
@@ -2441,6 +2433,10 @@ int ena_com_allocate_host_info(struct en
if (unlikely(!host_attr->host_info))
return -ENOMEM;
@ -119,8 +115,6 @@ index 7635c38e77dd..b6e6a4721931 100644
return 0;
}
diff --git a/drivers/net/ethernet/amazon/ena/ena_common_defs.h b/drivers/net/ethernet/amazon/ena/ena_common_defs.h
index bb8d73676eab..23beb7e7ed7b 100644
--- a/drivers/net/ethernet/amazon/ena/ena_common_defs.h
+++ b/drivers/net/ethernet/amazon/ena/ena_common_defs.h
@@ -32,8 +32,8 @@
@ -134,11 +128,9 @@ index bb8d73676eab..23beb7e7ed7b 100644
/* ENA operates with 48-bit memory addresses. ena_mem_addr_t */
struct ena_common_mem_addr {
diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c
index 69a49784b204..0c9c0d3ce856 100644
--- a/drivers/net/ethernet/amazon/ena/ena_netdev.c
+++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c
@@ -2206,7 +2206,8 @@ static u16 ena_select_queue(struct net_device *dev, struct sk_buff *skb,
@@ -2206,7 +2206,8 @@ static u16 ena_select_queue(struct net_d
return qid;
}
@ -148,15 +140,15 @@ index 69a49784b204..0c9c0d3ce856 100644
{
struct ena_admin_host_info *host_info;
int rc;
@@ -2220,6 +2221,7 @@ static void ena_config_host_info(struct ena_com_dev *ena_dev)
@@ -2220,6 +2221,7 @@ static void ena_config_host_info(struct
host_info = ena_dev->host_attr.host_info;
+ host_info->bdf = (pdev->bus->number << 8) | pdev->devfn;
host_info->os_type = ENA_ADMIN_OS_LINUX;
host_info->kernel_ver = LINUX_VERSION_CODE;
strncpy(host_info->kernel_ver_str, utsname()->version,
@@ -2230,7 +2232,9 @@ static void ena_config_host_info(struct ena_com_dev *ena_dev)
strlcpy(host_info->kernel_ver_str, utsname()->version,
@@ -2230,7 +2232,9 @@ static void ena_config_host_info(struct
host_info->driver_version =
(DRV_MODULE_VER_MAJOR) |
(DRV_MODULE_VER_MINOR << ENA_ADMIN_HOST_INFO_MINOR_SHIFT) |
@ -167,7 +159,7 @@ index 69a49784b204..0c9c0d3ce856 100644
rc = ena_com_set_host_attributes(ena_dev);
if (rc) {
@@ -2454,7 +2458,7 @@ static int ena_device_init(struct ena_com_dev *ena_dev, struct pci_dev *pdev,
@@ -2454,7 +2458,7 @@ static int ena_device_init(struct ena_co
*/
ena_com_set_admin_polling_mode(ena_dev, true);
@ -176,6 +168,3 @@ index 69a49784b204..0c9c0d3ce856 100644
/* Get Device Attributes*/
rc = ena_com_get_dev_attr_feat(ena_dev, get_feat_ctx);
--
2.19.2
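As an aside on the bdf value introduced in ena_config_host_info() above: it is
the standard PCI bus/device/function packing, with devfn itself packing slot
and function. A minimal standalone sketch with made-up example values:
	#include <stdint.h>
	#include <stdio.h>

	/* Same packing as the kernel's PCI_DEVFN(): slot in bits 7..3, function in bits 2..0. */
	#define PCI_DEVFN(slot, func)	((((slot) & 0x1f) << 3) | ((func) & 0x07))

	int main(void)
	{
		uint8_t bus = 0x03;			/* hypothetical bus number */
		uint8_t devfn = PCI_DEVFN(0x1f, 0x02);	/* slot 0x1f, function 2 -> 0xfa */
		uint16_t bdf = (bus << 8) | devfn;	/* same shift-and-or as host_info->bdf */

		printf("bdf = 0x%04x\n", bdf);		/* prints: bdf = 0x03fa */
		return 0;
	}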


@ -22,11 +22,9 @@ cc: linux-efi@vger.kernel.org
4 files changed, 50 insertions(+), 19 deletions(-)
create mode 100644 drivers/firmware/efi/secureboot.c
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 0957dd73d127..7c2162f9e769 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -1197,19 +1197,7 @@ void __init setup_arch(char **cmdline_p)
@@ -1159,19 +1159,7 @@ void __init setup_arch(char **cmdline_p)
/* Allocate bigger log buffer */
setup_log_buf(1);
@ -47,11 +45,9 @@ index 0957dd73d127..7c2162f9e769 100644
reserve_initrd();
diff --git a/drivers/firmware/efi/Makefile b/drivers/firmware/efi/Makefile
index 0329d319d89a..883f9f7eefc6 100644
--- a/drivers/firmware/efi/Makefile
+++ b/drivers/firmware/efi/Makefile
@@ -23,6 +23,7 @@ obj-$(CONFIG_EFI_FAKE_MEMMAP) += fake_mem.o
@@ -24,6 +24,7 @@ obj-$(CONFIG_EFI_FAKE_MEMMAP) += fake_m
obj-$(CONFIG_EFI_BOOTLOADER_CONTROL) += efibc.o
obj-$(CONFIG_EFI_TEST) += test/
obj-$(CONFIG_EFI_DEV_PATH_PARSER) += dev-path-parser.o
@ -59,9 +55,6 @@ index 0329d319d89a..883f9f7eefc6 100644
obj-$(CONFIG_APPLE_PROPERTIES) += apple-properties.o
arm-obj-$(CONFIG_EFI) := arm-init.o arm-runtime.o
diff --git a/drivers/firmware/efi/secureboot.c b/drivers/firmware/efi/secureboot.c
new file mode 100644
index 000000000000..9070055de0a1
--- /dev/null
+++ b/drivers/firmware/efi/secureboot.c
@@ -0,0 +1,38 @@
@ -103,11 +96,9 @@ index 000000000000..9070055de0a1
+ }
+ }
+}
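The body of the new secureboot.c is collapsed by the diff viewer. In outline,
the function it adds behaves roughly like the sketch below; this is an
assumption about the collapsed content, inferred from the efi_secureboot_mode
enum and the EFI_SECURE_BOOT flag this patch touches in efi.h below (the real
file also carries a copyright header):
	void __init efi_set_secure_boot(enum efi_secureboot_mode mode)
	{
		if (efi_enabled(EFI_BOOT)) {
			switch (mode) {
			case efi_secureboot_mode_disabled:
				pr_info("Secure boot disabled\n");
				break;
			case efi_secureboot_mode_enabled:
				set_bit(EFI_SECURE_BOOT, &efi.flags);
				pr_info("Secure boot enabled\n");
				break;
			default:
				pr_warn("Secure boot could not be determined\n");
				break;
			}
		}
	}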
diff --git a/include/linux/efi.h b/include/linux/efi.h
index 66f4a4e79f4b..7c7a7e33e4d1 100644
--- a/include/linux/efi.h
+++ b/include/linux/efi.h
@@ -1103,6 +1103,14 @@ extern int __init efi_setup_pcdp_console(char *);
@@ -1152,6 +1152,14 @@ extern int __init efi_setup_pcdp_console
#define EFI_DBG 8 /* Print additional debug info at runtime */
#define EFI_NX_PE_DATA 9 /* Can runtime data regions be mapped non-executable? */
#define EFI_MEM_ATTR 10 /* Did firmware publish an EFI_MEMORY_ATTRIBUTES table? */
@ -122,7 +113,7 @@ index 66f4a4e79f4b..7c7a7e33e4d1 100644
#ifdef CONFIG_EFI
/*
@@ -1115,6 +1123,7 @@ static inline bool efi_enabled(int feature)
@@ -1164,6 +1172,7 @@ static inline bool efi_enabled(int featu
extern void efi_reboot(enum reboot_mode reboot_mode, const char *__unused);
extern bool efi_is_table_address(unsigned long phys_addr);
@ -130,7 +121,7 @@ index 66f4a4e79f4b..7c7a7e33e4d1 100644
#else
static inline bool efi_enabled(int feature)
{
@@ -1133,6 +1142,7 @@ static inline bool efi_is_table_address(unsigned long phys_addr)
@@ -1182,6 +1191,7 @@ static inline bool efi_is_table_address(
{
return false;
}
@ -138,8 +129,8 @@ index 66f4a4e79f4b..7c7a7e33e4d1 100644
#endif
extern int efi_status_to_err(efi_status_t status);
@@ -1518,12 +1528,6 @@ efi_status_t efi_setup_gop(efi_system_table_t *sys_table_arg,
bool efi_runtime_disabled(void);
@@ -1572,12 +1582,6 @@ static inline bool efi_runtime_disabled(
extern void efi_call_virt_check_flags(unsigned long flags, const char *call);
-enum efi_secureboot_mode {

debian/patches/series

@ -81,10 +81,6 @@ bugfix/arm/ARM-dts-sun8i-h3-add-sy8106a-to-orange-pi-plus.patch
bugfix/arm64/arm64-dts-allwinner-a64-Enable-A64-timer-workaround.patch
bugfix/mips/MIPS-Loongson-Introduce-and-use-loongson_llsc_mb.patch
bugfix/powerpc/powerpc-vdso-make-vdso32-installation-conditional-in.patch
bugfix/mips/MIPS-scall64-o32-Fix-indirect-syscall-number-load.patch
bugfix/mips/MIPS-Bounds-check-virt_addr_valid.patch
bugfix/sparc64/sunhv-fix-device-naming-inconsistency-between-sunhv_console-and-sunhv_reg.patch
bugfix/arm64/arm64-compat-Provide-definition-for-COMPAT_SIGMINSTK.patch
# Arch features
features/mips/MIPS-increase-MAX-PHYSMEM-BITS-on-Loongson-3-only.patch
@ -106,7 +102,6 @@ bugfix/all/partially-revert-usb-kconfig-using-select-for-usb_co.patch
bugfix/all/kbuild-include-addtree-remove-quotes-before-matching-path.patch
debian/revert-objtool-fix-config_stack_validation-y-warning.patch
bugfix/all/mt76-use-the-correct-hweight8-function.patch
bugfix/all/revert-net-stmmac-send-tso-packets-always-from-queue.patch
bugfix/all/rtc-s35390a-set-uie_unsupported.patch
# Miscellaneous features
@ -163,93 +158,7 @@ features/all/db-mok-keyring/modsign-make-shash-allocation-failure-fatal.patch
# Security fixes
debian/i386-686-pae-pci-set-pci-nobios-by-default.patch
bugfix/all/xen-pciback-Don-t-disable-PCI_COMMAND-on-PCI-device-.patch
debian/ntfs-mark-it-as-broken.patch
bugfix/all/vfio-type1-Limit-DMA-mappings-per-container.patch
bugfix/all/0001-aio-clear-IOCB_HIPRI.patch
bugfix/all/0002-aio-use-assigned-completion-handler.patch
bugfix/all/0003-aio-separate-out-ring-reservation-from-req-allocatio.patch
bugfix/all/0004-aio-don-t-zero-entire-aio_kiocb-aio_get_req.patch
bugfix/all/0005-aio-use-iocb_put-instead-of-open-coding-it.patch
bugfix/all/0006-aio-split-out-iocb-copy-from-io_submit_one.patch
bugfix/all/0007-aio-abstract-out-io_event-filler-helper.patch
bugfix/all/0008-aio-initialize-kiocb-private-in-case-any-filesystems.patch
bugfix/all/0009-aio-simplify-and-fix-fget-fput-for-io_submit.patch
bugfix/all/0010-pin-iocb-through-aio.patch
bugfix/all/0011-aio-fold-lookup_kiocb-into-its-sole-caller.patch
bugfix/all/0012-aio-keep-io_event-in-aio_kiocb.patch
bugfix/all/0013-aio-store-event-at-final-iocb_put.patch
bugfix/all/0014-Fix-aio_poll-races.patch
bugfix/all/tracing-fix-buffer_ref-pipe-ops.patch
bugfix/all/0001-mm-make-page-ref-count-overflow-check-tighter-and-mo.patch
bugfix/all/0002-mm-add-try_get_page-helper-function.patch
bugfix/all/0003-mm-prevent-get_user_pages-from-overflowing-page-refc.patch
bugfix/all/0004-fs-prevent-page-refcount-overflow-in-pipe_buf_get.patch
bugfix/all/spec/0001-Documentation-l1tf-Fix-small-spelling-typo.patch
bugfix/all/spec/0002-x86-cpu-Sanitize-FAM6_ATOM-naming.patch
bugfix/all/spec/0003-kvm-x86-Report-STIBP-on-GET_SUPPORTED_CPUID.patch
bugfix/all/spec/0004-x86-msr-index-Cleanup-bit-defines.patch
bugfix/all/spec/0005-x86-speculation-Consolidate-CPU-whitelists.patch
bugfix/all/spec/0006-x86-speculation-mds-Add-basic-bug-infrastructure-for.patch
bugfix/all/spec/0007-x86-speculation-mds-Add-BUG_MSBDS_ONLY.patch
bugfix/all/spec/0008-x86-kvm-Expose-X86_FEATURE_MD_CLEAR-to-guests.patch
bugfix/all/spec/0009-x86-speculation-mds-Add-mds_clear_cpu_buffers.patch
bugfix/all/spec/0010-x86-speculation-mds-Clear-CPU-buffers-on-exit-to-use.patch
bugfix/all/spec/0011-x86-kvm-vmx-Add-MDS-protection-when-L1D-Flush-is-not.patch
bugfix/all/spec/0012-x86-speculation-mds-Conditionally-clear-CPU-buffers-.patch
bugfix/all/spec/0013-x86-speculation-mds-Add-mitigation-control-for-MDS.patch
bugfix/all/spec/0014-x86-speculation-mds-Add-sysfs-reporting-for-MDS.patch
bugfix/all/spec/0015-x86-speculation-mds-Add-mitigation-mode-VMWERV.patch
bugfix/all/spec/0016-Documentation-Move-L1TF-to-separate-directory.patch
bugfix/all/spec/0017-Documentation-Add-MDS-vulnerability-documentation.patch
bugfix/all/spec/0018-x86-speculation-mds-Add-mds-full-nosmt-cmdline-optio.patch
bugfix/all/spec/0019-x86-speculation-Move-arch_smt_update-call-to-after-m.patch
bugfix/all/spec/0020-x86-speculation-mds-Add-SMT-warning-message.patch
bugfix/all/spec/0021-x86-speculation-mds-Fix-comment.patch
bugfix/all/spec/0022-x86-speculation-mds-Print-SMT-vulnerable-on-MSBDS-wi.patch
bugfix/all/spec/0023-cpu-speculation-Add-mitigations-cmdline-option.patch
bugfix/all/spec/0024-x86-speculation-Support-mitigations-cmdline-option.patch
bugfix/all/spec/0025-powerpc-speculation-Support-mitigations-cmdline-opti.patch
bugfix/all/spec/0026-s390-speculation-Support-mitigations-cmdline-option.patch
bugfix/all/spec/0027-x86-speculation-mds-Add-mitigations-support-for-MDS.patch
bugfix/all/spec/0028-x86-mds-Add-MDSUM-variant-to-the-MDS-documentation.patch
bugfix/all/spec/0029-Documentation-Correct-the-possible-MDS-sysfs-values.patch
bugfix/all/spec/0030-x86-speculation-mds-Fix-documentation-typo.patch
bugfix/all/spec/powerpc-64s-include-cpu-header.patch
bugfix/all/brcmfmac-assure-SSID-length-from-firmware-is-limited.patch
bugfix/all/brcmfmac-add-subtype-check-for-event-handling-in-dat.patch
bugfix/all/ext4-zero-out-the-unused-memory-region-in-the-extent.patch
bugfix/all/Bluetooth-hidp-fix-buffer-overflow.patch
bugfix/all/mwifiex-fix-possible-buffer-overflows-at-parsing-bss.patch
bugfix/all/mwifiex-abort-at-too-short-bss-descriptor-element.patch
bugfix/all/mwifiex-don-t-abort-on-small-spec-compliant-vendor-ies.patch
bugfix/all/mm-mincore.c-make-mincore-more-conservative.patch
bugfix/all/mwifiex-fix-heap-overflow-in-mwifiex_uap_parse_tail_.patch
bugfix/all/tcp-limit-payload-size-of-sacked-skbs.patch
bugfix/all/tcp-tcp_fragment-should-apply-sane-memory-limits.patch
bugfix/all/tcp-add-tcp_min_snd_mss-sysctl.patch
bugfix/all/tcp-enforce-tcp_min_snd_mss-in-tcp_mtu_probing.patch
bugfix/all/tcp-refine-memory-limit-test-in-tcp_fragment.patch
bugfix/all/ptrace-Fix-ptracer_cred-handling-for-PTRACE_TRACEME.patch
bugfix/x86/x86-insn-eval-Fix-use-after-free-access-to-LDT-entry.patch
bugfix/powerpc/powerpc-mm-64s-hash-Reallocate-context-ids-on-fork.patch
bugfix/all/nfc-Ensure-presence-of-required-attributes-in-the-deactivate_target.patch
bugfix/all/binder-fix-race-between-munmap-and-direct-reclaim.patch
bugfix/all/scsi-libsas-fix-a-race-condition-when-smp-task-timeout.patch
bugfix/all/input-gtco-bounds-check-collection-indent-level.patch
bugfix/all/net-switch-IP-ID-generator-to-siphash.patch
bugfix/all/floppy-fix-div-by-zero-in-setup_format_params.patch
bugfix/all/floppy-fix-out-of-bounds-read-in-copy_buffer.patch
bugfix/all/Bluetooth-hci_uart-check-for-missing-tty-operations.patch
bugfix/powerpc/powerpc-tm-Fix-oops-on-sigreturn-on-systems-without-TM.patch
bugfix/x86/x86-cpufeatures-Carve-out-CQM-features-retrieval.patch
bugfix/x86/x86-cpufeatures-Combine-word-11-and-12-into-a-new-sc.patch
bugfix/x86/x86-speculation-Prepare-entry-code-for-Spectre-v1-sw.patch
bugfix/x86/x86-speculation-Enable-Spectre-v1-swapgs-mitigations.patch
bugfix/x86/x86-entry-64-Use-JMP-instead-of-JMPQ.patch
bugfix/x86/x86-speculation-swapgs-Exclude-ATOMs-from-speculatio.patch
bugfix/all/Documentation-Add-section-about-CPU-vulnerabilities-.patch
bugfix/all/Documentation-Add-swapgs-description-to-the-Spectre-.patch
# Fix exported symbol versions
bugfix/all/module-disable-matching-missing-version-crc.patch