Something is wrong with the network, I would appreciate some help!

Itay1778

Patron
Joined
Jan 29, 2018
Messages
269
Hi everyone, I suddenly discovered a network performance problem in TN (TrueNAS).
It is a very strange problem that I can't solve.
My TN is configured with access to 2 separate networks through VLANs, and the two networks are routed to each other by my pfsense firewall.
Everything worked fine until about last week. The only thing that changed is that I upgraded the TN, both hardware and software, but nothing related to the network (I added new drives and updated from TrueNAS 13.0 U3 to U5).

1686922297648.png

As you can see in the picture, I also have a LAGG on the TN, but it is not the problem: one of the things I tried was deleting it and testing the NICs individually, and each of them shows the same problem.
I also recreated the VLANs.
The problem: I run iperf with the TN as the server on its Subnet 3 address, and the client is on Subnet 2, so the traffic has to go through the gateway. I get 3 megabytes and then it drops to 0. Before you say it's a pfsense problem, it's not: I ran the same test against other servers and they don't have this problem. I also tested the other way around, from Subnet 3 to the TN's Subnet 2 address, with the same result. It only happens with the TN.
1686922425184.png

I also restarted all my equipment, and that did not solve the problem.
I will point out that if I test against a jail that sits on the VLAN 3 bridge, it doesn't have this problem at all.
The problem only appears when I test directly against the TN.
I don't know if it's related, but there is a Windows VM on Subnet 3, directly connected to the same subnet as the TN (3.1), that also has this problem, but only in iperf reverse mode, which is very strange because it only happens with this client.
I don't see why this problem would affect it...
I hope I explained myself properly, because I have gotten lost in this problem myself...
The default route set on the TN is on Subnet 2.
Another thing I only noticed in recent tests: the Retr (retransmit) count is relatively high even when I test through the jail, although that test does finish at the speed I expect. The same is true when I test directly against the TN on Subnet 2.
The first picture is: iperf reverse mode with -P 30, from the jail on Subnet 3, with the client on Subnet 2.
The second picture is: iperf reverse mode, from the jail on Subnet 3, with the client on Subnet 2.

1686922761426.png
1686922866716.png
 

Itay1778

Please Help!!
 

NickF

Guru
Joined
Jun 12, 2014
Messages
763
Itay1778 said:
The problem: I run iperf with the TN as the server on its Subnet 3 address, and the client is on Subnet 2, so the traffic has to go through the gateway. I get 3 megabytes and then it drops to 0. Before you say it's a pfsense problem, it's not: I ran the same test against other servers and they don't have this problem. I also tested the other way around, from Subnet 3 to the TN's Subnet 2 address, with the same result. It only happens with the TN.

I am extremely confused by this statement. Can you clarify what you are saying here?
Itay1778 said:
I don't know if it's related, but there is a Windows VM on Subnet 3, directly connected to the same subnet as the TN (3.1), that also has this problem, but only in iperf reverse mode, which is very strange because it only happens with this client.

If another device in the same subnet is having a problem, it's most likely related. From what I have cobbled together here, it sounds like you have an asymmetric routing problem/loop on the pfsense box.

On the TrueNAS type:
Code:
traceroute IP_ON_THE_OTHER_SUBNET


On that Windows box type:
Code:
tracert IP_ON_THE_OTHER_SUBNET


Now from a client on that other VLAN do the opposite.

On that client type (tracert on Windows, traceroute on Linux):
Code:
tracert IP_OF_TN_VLAN_3

Code:
tracert IP_OF_WINDOWS_VLAN_3


I would be willing to bet that shows you the problem :P
 

Itay1778

NickF said:
I am extremely confused by this statement. Can you clarify what you are saying here?

Sorry for that.
I will try again. This happens with iperf or any other network traffic (iperf is just to illustrate the problem).
The TN is the server, and I connect to its Subnet 3 address, 3.1, on VLAN 3.
The clients I'm testing with (let's say a Linux machine) are on Subnet 2 in VLAN 2.
When I run iperf I get 3Mbit and then it drops to 0, as you can see in the screenshots.
The same thing happens in the opposite direction: the clients are in VLAN 3 and I run iperf against the TN's address in VLAN 2. I'll mention again that the TN has direct access to, and an address in, both VLAN 2 and VLAN 3. Please look at the screenshot of the network interfaces; I think that is why this problem only happens with VLAN 2 and 3.
It just doesn't route correctly, and I don't know how to fix it.

NickF said:
If another device in the same subnet is having a problem, it's most likely related. From what I have cobbled together here, it sounds like you have an asymmetric routing problem/loop on the pfsense box.

OK, so it's not related. That client had a separate problem: protection software on it was blocking traffic. After I fixed that in the software, it talks to the TN on the same subnet perfectly.

traceroute from TN to its IP on VLAN 3:

Code:
traceroute to 192.168.3.1 (192.168.3.1), 64 hops max, 40 byte packets
 1  192.168.3.1 (192.168.3.1)  0.028 ms  0.012 ms  0.010 ms


traceroute from Linux Machine on VLAN 2 to the TN IP on Subnet 3:
Code:
traceroute to 192.168.3.1 (192.168.3.1), 30 hops max, 60 byte packets
 1  192.168.2.250 (192.168.2.250)  0.365 ms  0.323 ms  0.305 ms
 2  192.168.3.1 (192.168.3.1)  1.070 ms  0.281 ms  0.265 ms


traceroute from Windows on VLAN 3 to TN IP on VLAN 3:
Code:
Tracing route to 192.168.3.1 over a maximum of 30 hops

  1    <1 ms    <1 ms    <1 ms  192.168.3.1

Trace complete.

traceroute from Windows on VLAN 3 to TN IP on VLAN 2:

Code:
Tracing route to 192.168.2.1 over a maximum of 30 hops

  1    <1 ms    <1 ms    <1 ms  192.168.3.250
  2    <1 ms    <1 ms    <1 ms  192.168.2.1

Trace complete.

Note: 192.168.3.250 and 192.168.2.250 are the pfsense addresses.
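
The traceroutes above only show the forward path. One explanation that would fit the asymmetric routing idea is that the TN, which has an address on both subnets, answers a Subnet 2 client directly out of its VLAN 2 interface instead of going back through pfsense. The sketch below models plain destination-based next-hop selection to illustrate that. It is not taken from the TN's real routing table: the /24 prefix lengths, the interface names, and the client address 192.168.2.50 are assumptions made for illustration.

```python
import ipaddress

# Addresses from the thread; the /24 prefix lengths and the interface
# names are assumptions, not taken from the TrueNAS configuration.
TN_INTERFACES = {
    "vlan2": ipaddress.ip_interface("192.168.2.1/24"),
    "vlan3": ipaddress.ip_interface("192.168.3.1/24"),
}
TN_DEFAULT_GW = ipaddress.ip_address("192.168.2.250")  # pfsense on Subnet 2

# Hypothetical client on Subnet 2 that uses pfsense as its gateway.
CLIENT_INTERFACES = {"eth0": ipaddress.ip_interface("192.168.2.50/24")}
CLIENT_GW = ipaddress.ip_address("192.168.2.250")


def next_hop(interfaces, default_gw, dst):
    """Return dst itself if it is on a directly connected network,
    otherwise the default gateway (simple destination-based routing)."""
    dst = ipaddress.ip_address(dst)
    for iface in interfaces.values():
        if dst in iface.network:
            return dst
    return default_gw


client_ip = CLIENT_INTERFACES["eth0"].ip

# Client -> TN's Subnet 3 address: not on-link for the client.
forward = next_hop(CLIENT_INTERFACES, CLIENT_GW, "192.168.3.1")
# TN -> client: the client is on-link via the TN's VLAN 2 interface.
reply = next_hop(TN_INTERFACES, TN_DEFAULT_GW, client_ip)

forward_via_pfsense = forward == CLIENT_GW
reply_via_pfsense = reply == TN_DEFAULT_GW
asymmetric = forward_via_pfsense != reply_via_pfsense

print("client -> 192.168.3.1 next hop:", forward)
print("TN -> client next hop:", reply)
print("asymmetric path:", asymmetric)
```

If the real paths look like this, pfsense would see only one direction of each TCP connection, and a stateful firewall can drop such flows after the first packets. That would need to be confirmed on the TN itself, for example by checking which interface the replies actually leave from.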
 