Revision a0783cd0c810504427777e8aae20d5f4f8b652a0 authored by Corinna Vinschen on 19 November 2019, 09:09:39 UTC, committed by David S. Miller on 20 November 2019, 00:41:11 UTC
During performance testing, I found that one of my r8169 NICs, an
8168c model, suffered a major performance loss.

Running netperf's TCP_STREAM test didn't return the expected
throughput of > 900 Mb/s, but rather only about 22 Mb/s.  Strangely
enough, the TCP_MAERTS and UDP_STREAM tests both returned
throughput > 900 Mb/s, as did TCP_STREAM with the other r8169 NICs I can
test (either one of 8169s, 8168e, 8168f).

Bisecting turned up commit 93681cd7d94f83903cb3f0f95433d10c28a7e9a5,
"r8169: enable HW csum and TSO" as the culprit.

I added my 8168c version, RTL_GIGA_MAC_VER_22, to the code
special-casing the 8168evl as per the patch below.  This fixed the
performance problem for me.

Fixes: 93681cd7d94f ("r8169: enable HW csum and TSO")
Signed-off-by: Corinna Vinschen <vinschen@redhat.com>
Reviewed-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
1 parent c9d55b6
generic_mpih-add1.c
// SPDX-License-Identifier: GPL-2.0-or-later
/* mpihelp-add_1.c  -  MPI helper functions
 * Copyright (C) 1994, 1996, 1997, 1998,
 *               2000 Free Software Foundation, Inc.
 *
 * This file is part of GnuPG.
 *
 * Note: This code is heavily based on the GNU MP Library.
 *	 Actually it's the same code with only minor changes in the
 *	 way the data is stored; this is to support the abstraction
 *	 of an optional secure memory allocation which may be used
 *	 to avoid revealing of sensitive data due to paging etc.
 *	 The GNU MP Library itself is published under the LGPL;
 *	 however I decided to publish this code under the plain GPL.
 */

#include "mpi-internal.h"
#include "longlong.h"

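/*
 * Add the SIZE-limb numbers at S1_PTR and S2_PTR (least significant
 * limb first), store the SIZE-limb sum at RES_PTR, and return the
 * carry out of the most significant limb (0 or 1).
 */
mpi_limb_t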
mpihelp_add_n(mpi_ptr_t res_ptr, mpi_ptr_t s1_ptr,
	      mpi_ptr_t s2_ptr, mpi_size_t size)
{
	mpi_limb_t x, y, cy;
	mpi_size_t j;

	/* The loop counter and index J go from -SIZE to -1, so a single
	   increment both advances the index and tests for termination
	   (the loop ends when J reaches 0).  SIZE must be at least 1.  */
	j = -size;

	/* Offset the base pointers to compensate for the negative indices. */
	s1_ptr -= j;
	s2_ptr -= j;
	res_ptr -= j;

	cy = 0;
	do {
		y = s2_ptr[j];
		x = s1_ptr[j];
		y += cy;	/* add previous carry to one addend */
		cy = y < cy;	/* get out carry from that addition */
		y += x;		/* add other addend */
		cy += y < x;	/* get out carry from that add, combine */
		res_ptr[j] = y;
	} while (++j);

	return cy;
}
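
The carry handling above can be exercised in isolation. The following is a minimal standalone sketch, not part of the kernel source: it assumes 64-bit limbs, and the names `limb_t`, `add_n` and `add_n_selfcheck` are hypothetical stand-ins for `mpi_limb_t` and `mpihelp_add_n`. It repeats the same negative-index loop and checks two cases where a carry crosses a limb boundary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for mpi_limb_t; 64-bit limbs are an assumption. */
typedef uint64_t limb_t;

/*
 * Add two n-limb numbers stored least significant limb first.
 * n must be at least 1. Returns the carry out of the top limb (0 or 1).
 */
static limb_t add_n(limb_t *res, const limb_t *s1, const limb_t *s2,
		    ptrdiff_t n)
{
	limb_t cy = 0;
	ptrdiff_t j = -n;

	/* Point one past the end so that indices -n..-1 cover the arrays. */
	s1 -= j;
	s2 -= j;
	res -= j;

	do {
		limb_t x = s1[j];
		limb_t y = s2[j];

		y += cy;	/* add the incoming carry */
		cy = y < cy;	/* wrapped only if y was all ones and cy was 1 */
		y += x;		/* add the other addend */
		cy += y < x;	/* wrapped if the sum is smaller than x */
		res[j] = y;
	} while (++j);

	return cy;
}

/* Two small checks of carry propagation between limbs. */
static void add_n_selfcheck(void)
{
	limb_t one[2] = { 1, 0 };
	limb_t low_max[2] = { UINT64_MAX, 0 };
	limb_t all_max[2] = { UINT64_MAX, UINT64_MAX };
	limb_t r[2];

	/* (2^64 - 1) + 1 = 2^64: limb 0 wraps to 0, carry lands in limb 1. */
	assert(add_n(r, low_max, one, 2) == 0);
	assert(r[0] == 0 && r[1] == 1);

	/* (2^128 - 1) + 1 wraps both limbs and reports a carry out. */
	assert(add_n(r, all_max, one, 2) == 1);
	assert(r[0] == 0 && r[1] == 0);
}
```

The two carry tests can never both fire in the same iteration: if adding the incoming carry wraps `y` to 0, then adding `x` cannot wrap again, so `cy` stays 0 or 1.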