VirtualBox idle Windows guest high CPU usage on host - virtualbox

VirtualBox windows 8.1 image is created on i7 system and performing perfectly (linux and windows host). When image is moved on Xeon system ~300% linux host CPU is used with the idle guest.
VBoxManage showvminfo vm
Name: vm
Groups: /
Guest OS: Windows 8.1 (64-bit)
UUID: fdb2debf-31ff-4867-8c53-a315d12d348b
Config file: /root/VirtualBox VMs/vm/vm.vbox
Snapshot folder: /root/VirtualBox VMs/vm/Snapshots
Log folder: /root/VirtualBox VMs/vm/Logs
Hardware UUID: fdb2debf-31ff-4867-8c53-a315d12d348b
Memory size 8000MB
Page Fusion: disabled
VRAM size: 128MB
CPU exec cap: 100%
HPET: disabled
CPUProfile: host
Chipset: piix3
Firmware: BIOS
Number of CPUs: 32
PAE: enabled
Long Mode: enabled
Triple Fault Reset: disabled
APIC: enabled
X2APIC: disabled
Nested VT-x/AMD-V: disabled
CPUID Portability Level: 0
CPUID overrides: None
Boot menu mode: disabled
Boot Device 1: HardDisk
Boot Device 2: Not Assigned
Boot Device 3: Not Assigned
Boot Device 4: Not Assigned
ACPI: enabled
IOAPIC: enabled
BIOS APIC mode: APIC
Time offset: 0ms
RTC: local time
Hardw. virt.ext: enabled
Nested Paging: enabled
Large Pages: enabled
VT-x VPID: enabled
VT-x unr. exec.: enabled
Paravirt. Provider: Default
Effective Paravirt. Prov.: HyperV
State: running (since 2019-05-03T04:26:59.439000000)
Monitor count: 1
3D Acceleration: disabled
2D Video Acceleration: disabled
Teleporter Enabled: disabled
Teleporter Port: 0
Teleporter Address:
Teleporter Password:
Tracing Enabled: disabled
Allow Tracing to Access VM: disabled
Tracing Configuration:
Autostart Enabled: disabled
Autostart Delay: 0
Default Frontend:
Storage Controller Name (0): cont
Storage Controller Type (0): IntelAhci
Storage Controller Instance Number (0): 0
Storage Controller Max Port Count (0): 30
Storage Controller Port Count (0): 30
Storage Controller Bootable (0): on
cont (0, 0): /root/VirtualBox VMs/vm/Snapshots/{eec50ba7-9d02-4c51-9676-4ed0a761c6ce}.vdi (UUID: eec50ba7-9d02-4c51-9676-4ed0a761c6ce)
NIC 1: MAC: 080027F44A7A, Attachment: NAT, Cable connected: on, Trace: off (file: none), Type: 82545EM, Reported speed: 0 Mbps, Boot priority: 0, Promisc Policy: deny, Bandwidth group: none
NIC 1 Settings: MTU: 0, Socket (send: 64, receive: 64), TCP Window (send:64, receive: 64)
NIC 1 Rule(0): name = http, protocol = tcp, host ip = , host port = 8000, guest ip = , guest port = 80
NIC 1 Rule(1): name = rdp, protocol = tcp, host ip = , host port = 3389, guest ip = , guest port = 3389
NIC 2: disabled
NIC 3: disabled
NIC 4: disabled
NIC 5: disabled
NIC 6: disabled
NIC 7: disabled
NIC 8: disabled
Pointing Device: PS/2 Mouse
Keyboard Device: PS/2 Keyboard
UART 1: disabled
UART 2: disabled
UART 3: disabled
UART 4: disabled
LPT 1: disabled
LPT 2: disabled
Audio: disabled
Audio playback: disabled
Audio capture: disabled
Clipboard Mode: disabled
Drag and drop Mode: disabled
Session name: headless
Video mode: 2217x1272x32 at 0,0 enabled
VRDE: enabled (Address 0.0.0.0, Ports 6666, MultiConn: off, ReuseSingleConn: on, Authentication type: external)
VRDE port: 6666
Video redirection: disabled
VRDE property : TCP/Ports = "6666"
VRDE property : TCP/Address = <not set>
VRDE property : VideoChannel/Enabled = <not set>
VRDE property : VideoChannel/Quality = <not set>
VRDE property : VideoChannel/DownscaleProtection = <not set>
VRDE property : Client/DisableDisplay = <not set>
VRDE property : Client/DisableInput = <not set>
VRDE property : Client/DisableAudio = <not set>
VRDE property : Client/DisableUSB = <not set>
VRDE property : Client/DisableClipboard = <not set>
VRDE property : Client/DisableUpstreamAudio = <not set>
VRDE property : Client/DisableRDPDR = <not set>
VRDE property : H3DRedirect/Enabled = <not set>
VRDE property : Security/Method = <not set>
VRDE property : Security/ServerCertificate = <not set>
VRDE property : Security/ServerPrivateKey = <not set>
VRDE property : Security/CACertificate = <not set>
VRDE property : Audio/RateCorrectionMode = <not set>
VRDE property : Audio/LogPath = <not set>
OHCI USB: disabled
EHCI USB: disabled
xHCI USB: disabled
USB Device Filters:
<none>
Available remote USB devices:
<none>
Currently Attached USB Devices:
<none>
Bandwidth groups: <none>
Shared folders:<none>
VRDE Connection: active
Clients so far: 3
Start time: 2019/05/03 04:27:05 UTC
Sent: 24421625Bytes
Average speed: 26504B/s
Sent total: 24421625Bytes
Received: 270108Bytes
Speed: 293B/s
Received total: 270108Bytes
User name: admin
Domain:
Client name: dev
Client IP: 192.168.0.59
Client version: 2600
Encryption: RDP4
Capturing: not active
Capture audio: not active
Capture screens:
Capture file: /root/VirtualBox VMs/vm/vm.webm
Capture dimensions: 1024x768
Capture rate: 512kbps
Capture FPS: 25kbps
Capture options:
Guest:
Configured memory balloon size: 0MB
OS type: Windows81_64
Additions run level: 3
Additions version 6.0.6 r130049
Guest Facilities:
Facility "VirtualBox Base Driver": active/running (last update: 2019/05/03 04:27:11 UTC)
Facility "VirtualBox System Service": active/running (last update: 2019/05/03 04:27:24 UTC)
Facility "VirtualBox Desktop Integration": active/running (last update: 2019/05/03 04:27:53 UTC)
Facility "Seamless Mode": active/running (last update: 2019/05/03 04:27:11 UTC)
Facility "Graphics Mode": active/running (last update: 2019/05/03 04:27:11 UTC)

Solution is counter-intuitive, limit guest OS run on one (1) core. It brings extremely fast OS boot times and in my case best processing performance. All this has something to do with CPU context switching and optimizations at OS or hardware levels.
Didn't figured out what exactly makes VBox perform this way but my guess is that something related to CPU and not limited to i7 / Xeon differences.
Found this usefull:
https://www.reddit.com/r/linux/comments/1tqlsz/adding_cpus_to_virtualbox_guests_makes_guests/

In my case it was needed to increase cpu count from 1 to 2 in the virtual machine settings (the host cpu load was constantly 25% before and became 1% after).

Related

Running F-stack DPDK executable - Unsupported Rx multi queue mode 1

I have a C++ program which does lots of stuff, but most importantly it is setup to use F-Stack, which is built on DPDK:
int main(int argc, char * argv[])
{
ff_init(argc, argv);
...
}
And I run the program like this:
sudo ./main --conf /etc/f-stack.conf --proc-type=primary
This is the error message I am receiving:
virtio_dev_configure(): Unsupported Rx multi queue mode 1
Port0 dev_configure = -22
EAL: Error - exiting with code: 1
Cause: init_port_start failed
I have not had this problem before when running this executable on a Centos 8 AWS instance. I am now running this on a Centos 8 Alibaba Cloud instance. So there's possibly some difference when running on Alibaba.
The only other thing I can think of is that there might be a configuration problem. However, I copied /etc/f-stack.conf from my AWS instance to Alibaba and updated some IP addresses, nothing else. So nothing significant has changed.
Any idea what's going on here and how to fix it?
Edit: here is my /etc/f-stack.conf file (without IP addresses included):
[dpdk]
# Hexadecimal bitmask of cores to run on.
lcore_mask=1
# Number of memory channels.
channel=4
# Specify base virtual address to map.
#base_virtaddr=0x7f0000000000
# Promiscuous mode of nic, defualt: enabled.
promiscuous=1
numa_on=1
# TX checksum offload skip, default: disabled.
# We need this switch enabled in the following cases:
# -> The application want to enforce wrong checksum for testing purposes
# -> Some cards advertize the offload capability. However, doesn't calculate che cksum.
tx_csum_offoad_skip=0
# TCP segment offload, default: disabled.
tso=0
# HW vlan strip, default: enabled.
vlan_strip=1
# sleep when no pkts incoming
# unit: microseconds
idle_sleep=0
# sent packet delay time(0-100) while send less than 32 pkts.
# default 100 us.
# if set 0, means send pkts immediately.
# if set >100, will dealy 100 us.
# unit: microseconds
pkt_tx_delay=100
# use symmetric Receive-side Scaling(RSS) key, default: disabled.
symmetric_rss=0
# PCI device enable list.
# And driver options
#pci_whitelist=02:00.0
# enabled port list
#
# EBNF grammar:
#
# exp ::= num_list {"," num_list}
# num_list ::= <num> | <range>
# range ::= <num>"-"<num>
# num ::= '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9'
#
# examples
# 0-3 ports 0, 1,2,3 are enabled
# 1-3,4,7 ports 1,2,3,4,7 are enabled
#
# If use bonding, shoule config the bonding port id in port_list
# and not config slave port id in port_list
# such as, port 0 and port 1 trank to a bonding port 2,
# should set `port_list=2` and config `[port2]` section
port_list=0
# Number of vdev.
nb_vdev=0
# Number of bond.
nb_bond=0
# Each core write into own pcap file, which is open one time, close one time if enough.
# Support dump the first snaplen bytes of each packet.
# if pcap file is lager than savelen bytes, it will be closed and next file was dumped into.
[pcap]
enable=0
snaplen=96
savelen=16777216
savepath=.
# Port config section
# Correspond to dpdk.port_list's index: port0, port1...
[port0]
addr=<ADDR>
netmask=<NETMASK>
broadcast=<BROADCAST>
gateway=<GATEWAY>
# IPv6 net addr, Optional parameters.
#addr6=ff::02
#prefix_len=64
#gateway6=ff::01
# Multi virtual IPv4/IPv6 net addr, Optional parameters.
# `vip_ifname`: default `f-stack-x`
# `vip_addr`: Separated by semicolons, MAX number 64;
# Only support netmask 255.255.255.255, broadcast x.x.x.255 no w, hard code in `ff_veth_setvaddr`.
# `vip_addr6`: Separated by semicolons, MAX number 64.
# `vip_prefix_len`: All addr6 use the same prefix now, default 64.
#vip_ifname=lo0
#vip_addr=192.168.1.3;192.168.1.4;192.168.1.5;192.168.1.6
#vip_addr6=ff::03;ff::04;ff::05;ff::06;ff::07
#vip_prefix_len=64
# lcore list used to handle this port
# the format is same as port_list
#lcore_list=0
# bonding slave port list used to handle this port
# need to config while this port is a bonding port
# the format is same as port_list
#slave_port_list=0,1
# Vdev config section
# orrespond to dpdk.nb_vdev's index: vdev0, vdev1...
# iface : Shouldn't set always.
# path : The vuser device path in container. Required.
# queues : The max queues of vuser. Optional, default 1, greater or equal to the number of processes.
# queue_size : Queue size.Optional, default 256.
# mac : The mac address of vuser. Optional, default random, if vhost use phy NIC, it should be set to the phy NIC's mac.
# cq : Optional, if queues = 1, default 0; if queues > 1 default 1.
#[vdev0]
##iface=/usr/local/var/run/openvswitch/vhost-user0
#path=/var/run/openvswitch/vhost-user0
#queues=1
#queue_size=256
#mac=00:00:00:00:00:01
#cq=0
# bond config section
# See http://doc.dpdk.org/guides/prog_guide/link_bonding_poll_mode_drv_lib.html
#[bond0]
#mode=4
#slave=0000:0a:00.0,slave=0000:0a:00.1
#primary=0000:0a:00.0
#mac=f0:98:38:xx:xx:xx
## opt argument
#socket_id=0
#xmit_policy=l23
#lsc_poll_period_ms=100
#up_delay=10
#down_delay=50
# Kni config: if enabled and method=reject,
# all packets that do not belong to the following tcp_port and udp_port
# will transmit to kernel; if method=accept, all packets that belong to
# the following tcp_port and udp_port will transmit to kernel.
#[kni]
#enable=1
#method=reject
# The format is same as port_list
#tcp_port=80,443
#udp_port=53
# FreeBSD network performance tuning configurations.
# Most native FreeBSD configurations are supported.
[freebsd.boot]
hz=100
# Block out a range of descriptors to avoid overlap
# with the kernel's descriptor space.
# You can increase this value according to your app.
fd_reserve=1024
kern.ipc.maxsockets=262144
net.inet.tcp.syncache.hashsize=4096
net.inet.tcp.syncache.bucketlimit=100
net.inet.tcp.tcbhashsize=65536
kern.ncallout=262144
kern.features.inet6=1
net.inet6.ip6.auto_linklocal=1
net.inet6.ip6.accept_rtadv=2
net.inet6.icmp6.rediraccept=1
net.inet6.ip6.forwarding=0
[freebsd.sysctl]
kern.ipc.somaxconn=32768
kern.ipc.maxsockbuf=16777216
net.link.ether.inet.maxhold=5
net.inet.tcp.fast_finwait2_recycle=1
net.inet.tcp.sendspace=16384
net.inet.tcp.recvspace=8192
#net.inet.tcp.nolocaltimewait=1
net.inet.tcp.cc.algorithm=cubic
net.inet.tcp.sendbuf_max=16777216
net.inet.tcp.recvbuf_max=16777216
net.inet.tcp.sendbuf_auto=1
net.inet.tcp.recvbuf_auto=1
net.inet.tcp.sendbuf_inc=16384
net.inet.tcp.recvbuf_inc=524288
net.inet.tcp.sack.enable=1
net.inet.tcp.blackhole=1
net.inet.tcp.msl=2000
net.inet.tcp.delayed_ack=0
net.inet.udp.blackhole=1
net.inet.ip.redirect=0
net.inet.ip.forwarding=0
Edit 2: I added pci_whitelist=[PCIe BDF of NIC] to config and ran the following command:
The reason for the error is because of the check for virtio PMD in function virtio_dev_configure file [dpdk root folder]drivers/net/virtio/virtio_ethdev.c. This can be due to the current Fstack enables RSS for better flow distribution over its port-queue pair.
There 2 possible solution to fix the problem is to
find the configuration parameter in f-stack.conf to disable RSS or
change the FSTACK port configuration logic not to use RSS (by editing code).
File: lib/ff_dpdk_if.c
edit: line 627 from port_conf.rxmode.mq_mode = ETH_MQ_RX_RSS; to port_conf.rxmode.mq_mode = ETH_MQ_RX_NONE; and rebuild the fstack
Note: if you use physical NIC RSS is supported in most of cases. hence there will be no error there.

Terraform: how do I tell a Windows EC2 instance to use the entire disk? (PowerShell in userdata)

I have a Windows and a Linux AWS instance that are started in a near-identical way:
resource "aws_instance" "test-instance" {
key_name = var.key_name
subnet_id = var.subnet_id
ami = data.aws_ami.image.id
instance_type = var.instance_type
security_groups = var.security_groups
associate_public_ip_address = "true"
disable_api_termination = "false"
monitoring = "false"
tags = {
Name = "test-instance-linux-${local.timestamp}"
}
root_block_device {
volume_size = var.root_volume_size
}
lifecycle {
ignore_changes = [
tags.Name
]
}
}
test-instance-linux-${local.timestamp} for the Linux instance
test-instance-windows-${local.timestamp} for the Windows instance
var.root_volume_size equals '200' (GiB).
When I use this, then I see in the AWS console that both instances, Windows and Linux, have a 200 GiB volume as device /dev/sda1. Great.
However, while the Linux instance uses the entire 200 GiB, on Windows I only have 30 GiB available, which is most likely the size of the original Windows_Server-2019-English-Full-Base-... AMI I started from (customized with Packer).
How do I tell the Windows instance to use the entire disk?
Already tried
I added this to the resource "aws_instance" "test-instance":
user_data = file("userdata.txt")
And this is in userdata.txt:
<powershell>
# Variable specifying the drive you want to extend
$drive_letter = "C"
# Script to get the partition sizes and then resize the volume
$size = (Get-PartitionSupportedSize -DriveLetter $drive_letter)
Resize-Partition -DriveLetter $drive_letter -Size $size.SizeMax
</powershell>
(PowerShell code found on https://learn.microsoft.com/en-us/windows-server/storage/disk-management/extend-a-basic-volume)
However, actual free disk space is ~7.7 GiB (of 30), instead of expected ~177.7 GiB.
Disk info:
PS C:\Users\foobar> Get-PartitionSupportedSize -DriveLetter "C"
SizeMin SizeMax
------- -------
23978442752 214746267648
PS C:\Users\foobar> Get-Disk
Number Friendly Name Serial Number HealthStatus OperationalStatus Total Size Partition
Style
------ ------------- ------------- ------------ ----------------- ---------- ----------
0 NVMe Amazon Elastic B vol004fab740dda0c847_00000001. Healthy Online 200 GB MBR
PS C:\Users\foobar> Get-Partition
DiskPath: \\?\scsi#disk&ven_nvme&prod_amazon_elastic_b#4&26a12046&0&000000#{53f56307-b6bf-11d0-94f2-00a0c91efb8b}
PartitionNumber DriveLetter Offset Size Type
--------------- ----------- ------ ---- ----
1 C 1048576 30 GB IFS
PS C:\Users\foobar> Get-Volume
DriveLetter FriendlyName FileSystemType DriveType HealthStatus OperationalStatus SizeRemaining Size
----------- ------------ -------------- --------- ------------ ----------------- ------------- ----
C NTFS Fixed Healthy OK 7.7 GB 30 GB
It appears as if Terraform didn't run the PowerShell script?
When I manually log in over SSH and I manually run the PowerShell script, then it works, so I know that the script itself is correct.
PS C:\Users\ansible> Resize-Partition -DriveLetter 'C' -Size '214746267648'
PS C:\Users\ansible> Get-Partition
DiskPath: \\?\scsi#disk&ven_nvme&prod_amazon_elastic_b#4&26a12046&0&000000#{53f56307-b6bf-11d0-94f2-00a0c91efb8b}
PartitionNumber DriveLetter Offset Size Type
--------------- ----------- ------ ---- ----
1 C 1048576 200 GB IFS
PS C:\Users\foobar> Get-Volume
DriveLetter FriendlyName FileSystemType DriveType HealthStatus OperationalStatus SizeRemaining Size
----------- ------------ -------------- --------- ------------ ----------------- ------------- ----
C NTFS Fixed Healthy OK 177.69 GB 200 GB
So how do I make sure that Terraform actually tells the Windows instance to execute that PowerShell script?
Instead of using a PowerShell script and modifying the size after creation/boot it would be cleaner to customise the AMI and increase the boot volume size directly.
The steps to do that are explained in he AWS documentation but it boils down to spinning up an instance, doing the necessary modifications, creating a snapshot and turning this into a new AMI.

Connecting cassandra-stress to AWS Keyspaces

I've provisions a Keyspace on AWS and in order to make sure it can achieve our desired performance I'm trying to run the cassandra-stress tool on it and compare it to other architectures we're experimenting with.
I managed to connect to it using the following cqlshrc:
[connection]
port = 9142
factory = cqlshlib.ssl.ssl_transport_factory
[ssl]
validate = true
certfile = /root/.cassandra/AmazonRootCA1.pem
And the following command (hoping that soon enough there will be Python3 support, the development was completed this February according to their Jira ticket):
cqlsh cassandra.eu-central-1.amazonaws.com 9142 -u "myuser-at-722222222222" -p "12/12ZmHmtD1klsDk9cgqt/XXXXXXXXxUz6Sy687z/U=" --ssl --cqlversion="3.4.4"
Surprisingly or not, when using the official AWS guides things tend to work.
So I went on and tried connecting the cassandra-stress tool (I have it inside a Docker container, I'd rather keep my OS Java free) to the same Keyspace.
First I converted the AWS AmazonRootCA1.pem into cassandra_truststore.jks using the following commands (explained here):
openssl x509 -outform der -in AmazonRootCA1.pem -out temp_file.der
keytool -import -alias cassandra -keystore cassandra_truststore.jks -file temp_file.der
Now when I'm trying to run the actual tool like this:
./cassandra-stress write -node cassandra.eu-central-1.amazonaws.com -port native=9142 thrift=9142 jmx=9142 -transport truststore=/root/.cassandra/cassandra_truststore.jks truststore-password=mypassword -mode native cql3 user="myuser-at-722222222222" password="12/12ZmHmtD1klsDk9cgqt/XXXXXXXXxUz6Sy687z/U="
I'm getting the following error:
******************** Stress Settings ********************
Command:
Type: write
Count: -1
No Warmup: false
Consistency Level: LOCAL_ONE
Target Uncertainty: 0.020
Minimum Uncertainty Measurements: 30
Maximum Uncertainty Measurements: 200
Key Size (bytes): 10
Counter Increment Distibution: add=fixed(1)
Rate:
Auto: true
Min Threads: 4
Max Threads: 1000
Population:
Sequence: 1..1000000
Order: ARBITRARY
Wrap: true
Insert:
Revisits: Uniform: min=1,max=1000000
Visits: Fixed: key=1
Row Population Ratio: Ratio: divisor=1.000000;delegate=Fixed: key=1
Batch Type: not batching
Columns:
Max Columns Per Key: 5
Column Names: [C0, C1, C2, C3, C4]
Comparator: AsciiType
Timestamp: null
Variable Column Count: false
Slice: false
Size Distribution: Fixed: key=34
Count Distribution: Fixed: key=5
Errors:
Ignore: false
Tries: 10
Log:
No Summary: false
No Settings: false
File: null
Interval Millis: 1000
Level: NORMAL
Mode:
API: JAVA_DRIVER_NATIVE
Connection Style: CQL_PREPARED
CQL Version: CQL3
Protocol Version: V4
Username: myuser-at-722222222222
Password: *suppressed*
Auth Provide Class: null
Max Pending Per Connection: 128
Connections Per Host: 8
Compression: NONE
Node:
Nodes: [cassandra.eu-central-1.amazonaws.com]
Is White List: false
Datacenter: null
Schema:
Keyspace: keyspace1
Replication Strategy: org.apache.cassandra.locator.SimpleStrategy
Replication Strategy Pptions: {replication_factor=1}
Table Compression: null
Table Compaction Strategy: null
Table Compaction Strategy Options: {}
Transport:
factory=org.apache.cassandra.thrift.TFramedTransportFactory; truststore=/root/.cassandra/cassandra_truststore.jks; truststore-password=mypassword; keystore=null; keystore-password=null; ssl-protocol=TLS; ssl-alg=SunX509; store-type=JKS; ssl-ciphers=TLS_RSA_WITH_AES_128_CBC_SHA,TLS_RSA_WITH_AES_256_CBC_SHA;
Port:
Native Port: 9142
Thrift Port: 9142
JMX Port: 9142
Send To Daemon:
*not set*
Graph:
File: null
Revision: unknown
Title: null
Operation: WRITE
TokenRange:
Wrap: false
Split Factor: 1
java.lang.RuntimeException: com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: cassandra.eu-central-1.amazonaws.com/3.127.48.183:9142 (com.datastax.driver.core.exceptions.TransportException: [cassandra.eu-central-1.amazonaws.com/3.127.48.183] Channel has been closed))
at org.apache.cassandra.stress.settings.StressSettings.getJavaDriverClient(StressSettings.java:220)
at org.apache.cassandra.stress.settings.SettingsSchema.createKeySpacesNative(SettingsSchema.java:79)
at org.apache.cassandra.stress.settings.SettingsSchema.createKeySpaces(SettingsSchema.java:69)
at org.apache.cassandra.stress.settings.StressSettings.maybeCreateKeyspaces(StressSettings.java:228)
at org.apache.cassandra.stress.StressAction.run(StressAction.java:57)
at org.apache.cassandra.stress.Stress.run(Stress.java:143)
at org.apache.cassandra.stress.Stress.main(Stress.java:62)
Caused by: com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: cassandra.eu-central-1.amazonaws.com/3.127.48.183:9142 (com.datastax.driver.core.exceptions.TransportException: [cassandra.eu-central-1.amazonaws.com/3.127.48.183] Channel has been closed))
at com.datastax.driver.core.ControlConnection.reconnectInternal(ControlConnection.java:233)
at com.datastax.driver.core.ControlConnection.connect(ControlConnection.java:79)
at com.datastax.driver.core.Cluster$Manager.init(Cluster.java:1424)
at com.datastax.driver.core.Cluster.getMetadata(Cluster.java:403)
at org.apache.cassandra.stress.util.JavaDriverClient.connect(JavaDriverClient.java:160)
at org.apache.cassandra.stress.settings.StressSettings.getJavaDriverClient(StressSettings.java:211)
... 6 more
I've tried changing some parameters such as the jks password etc. (Just in case I was wrong) but I got a different error message so it's probably not the case.
Did I miss something?
Try using TLP Stress instead.
tlp-stress run RandomPartitionAccess -d 10m --host cassandra.us-east-1.amazonaws.com --port 9142 --username alice --password fLyWYFlTCD5J2gzGAZ –ssl --max-requests 4000 --dc us-east-2 --threads 10
https://thelastpickle.com/tlp-stress/

what address should i use for broadcast_rpc_address in cassandra.yaml

cluster_name: 'Cassandra Cluster'
num_tokens: 256
hinted_handoff_enabled: true
max_hint_window_in_ms: 10800000
hinted_handoff_throttle_in_kb: 1024
max_hints_delivery_threads: 2
authenticator: AllowAllAuthenticator
authorizer: AllowAllAuthorizer
permissions_validity_in_ms: 2000
partitioner: org.apache.cassandra.dht.Murmur3Partitioner
data_file_directories:
- /var/lib/cassandra/data
commitlog_directory: /var/lib/cassandra/commitlog
disk_failure_policy: stop
key_cache_size_in_mb:
key_cache_save_period: 14400
row_cache_size_in_mb: 0
row_cache_save_period: 0
saved_caches_directory: /var/lib/cassandra/saved_caches
commitlog_sync: periodic
commitlog_sync_period_in_ms: 10000
commitlog_segment_size_in_mb: 32
seed_provider:
- class_name: org.apache.cassandra.locator.SimpleSeedProvider
parameters:
- seeds: "seednode-A-IP,seednode-B-IP,seednode-C-IP"
concurrent_reads: 32
concurrent_writes: 32
trickle_fsync: false
trickle_fsync_interval_in_kb: 10240
storage_port: 7000
ssl_storage_port: 7001
listen_address: 10.8.9.83
start_native_transport: true
native_transport_port: 9042
start_rpc: true
rpc_address: 0.0.0.0
broadcast_rpc_address: NAT-GATEWAY-IP
rpc_port: 9160
rpc_keepalive: true
rpc_server_type: sync
thrift_framed_transport_size_in_mb: 15
incremental_backups: false
snapshot_before_compaction: false
auto_snapshot: true
tombstone_warn_threshold: 1000
tombstone_failure_threshold: 100000
column_index_size_in_kb: 64
compaction_throughput_mb_per_sec: 16
read_request_timeout_in_ms: 5000
range_request_timeout_in_ms: 10000
write_request_timeout_in_ms: 2000
cas_contention_timeout_in_ms: 1000
truncate_request_timeout_in_ms: 60000
request_timeout_in_ms: 10000
cross_node_timeout: false
endpoint_snitch: Ec2Snitch
dynamic_snitch_update_interval_in_ms: 100
dynamic_snitch_reset_interval_in_ms: 600000
dynamic_snitch_badness_threshold: 0.1
request_scheduler: org.apache.cassandra.scheduler.NoScheduler
server_encryption_options:
internode_encryption: none
keystore: conf/.keystore
keystore_password: cassandra
truststore: conf/.truststore
truststore_password: cassandra
client_encryption_options:
enabled: false
keystore: conf/.keystore
keystore_password: cassandra
internode_compression: all
inter_dc_tcp_nodelay: false
we have Cassandra cluster deployed on AWS with 3 seed nodes (static ENI attached) and 3 non seed nodes in Auto scaling group.
I have rpc_address set to 0.0.0.0, can someone tell me what should be the broadcast_rpc_address in cassandra.yaml file?
I was using cassandra 2.0.7 version before and was able to connect to cluster fine with just rpc_address as 0.0.0.0 and without broadcast_rpc_address this unset, but when i upgraded to 3.11.1 it is giving me error
CassandraDaemon.java:708 - Exception encountered during startup: If rpc_address is set to a wildcard address (0.0.0.0), then you must set broadcast_rpc_address to a value other than 0.0.0.0
If you are using Cassandra 2.1 or greater, you can configure broadcast_rpc_address. you can configure broadcast_rpc_address to be public IP.
Cassandra uses the "broadcast_address/listen_address" for internode connectivity and "broadcast_rpc_address/rpc_address" for the rpc interface (client -> coordinator (Cassandra node) requests).

Vagrant usbfilter make the guest machine entered an invalid state

Based on the following instructions:
https://gist.github.com/dergachev/3866825#vagrant-setup
Ubuntu Linaro
uname -a
Linux ken-desktop 3.11.0-18-generic #32-Ubuntu SMP Tue Feb 18 21:11:14 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
cat /proc/version
Linux version 3.11.0-18-generic (buildd#toyol) (gcc version 4.8.1 (Ubuntu/Linaro 4.8.1-10ubuntu8) ) #32-Ubuntu SMP Tue Feb 18 21:11:14 UTC 2014
Virtualbox-4.2
VBoxManage --version
4.2.16_Ubuntur86992
Vagrant 1.5 vagrant_1.5.0_x86_64.deb
In a cookbooks folder I cloned the following chef cookbooks:
git clone git://github.com/opscode-cookbooks/vim.git
git clone git://github.com/opscode-cookbooks/git.git
git clone git://github.com/opscode-cookbooks/apt.git
git clone git://github.com/tiokksar/chef-oh-my-zsh-solo.git
git clone git://github.com/opscode-cookbooks/openssl.git
git clone git://github.com/getaroom/chef-couchbase.git
I also installed this:
https://github.com/dotless-de/vagrant-vbguest
vagrant plugin install vagrant-vbguest
I try to make a nice Vagrantfile creating a precise64 VM with usb automatically mounted.
But each time I try to add an usbfilter on my virtualbox VM, I end up with that message:
% vagrant up
Bringing machine 'default' up with 'virtualbox' provider...
==> default: Importing base box 'hashicorp/precise64'...
==> default: Matching MAC address for NAT networking...
==> default: Checking if box 'hashicorp/precise64' is up to date...
==> default: Setting the name of the VM: smartofficeVM_default_1395303674511_42792
==> default: The cookbook path '/home/ken/smartofficeVM/databags' doesn't exist. Ignoring...
==> default: Clearing any previously set network interfaces...
==> default: Preparing network interfaces based on configuration...
default: Adapter 1: nat
==> default: Forwarding ports...
default: 3000 => 3000 (adapter 1)
default: 22 => 2222 (adapter 1)
==> default: Running 'pre-boot' VM customizations...
==> default: Booting VM...
==> default: Waiting for machine to boot. This may take a few minutes...
default: SSH address: 127.0.0.1:2222
default: SSH username: vagrant
default: SSH auth method: private key
default: Error: Connection refused. Retrying...
default: Error: Connection refused. Retrying...
default: Error: Connection refused. Retrying...
default: Error: Connection refused. Retrying...
default: Error: Connection refused. Retrying...
The guest machine entered an invalid state while waiting for it
to boot. Valid states are 'starting, running'. The machine is in the
'poweroff' state. Please verify everything is configured
properly and try again.
If the provider you're using has a GUI that comes with it,
it is often helpful to open that and watch the machine, since the
GUI often has more helpful error messages than Vagrant can retrieve.
For example, if you're using VirtualBox, run `vagrant up` while the
VirtualBox GUI is open.
my configuration file is the following:
% cat Vagrantfile
# -*- mode: ruby -*-
# vi: set ft=ruby :
# Vagrantfile API/syntax version. Don't touch unless you know what you're doing!
VAGRANTFILE_API_VERSION = "2"
Vagrant.configure(VAGRANTFILE_API_VERSION) do |config|
# All Vagrant configuration is done here. The most common configuration
# options are documented and commented below. For a complete reference,
# please see the online documentation at vagrantup.com.
# Every Vagrant virtual environment requires a box to build off of.
config.vm.box = "hashicorp/precise64"
# The url from where the 'config.vm.box' box will be fetched if it
# doesn't already exist on the user's system.
# config.vm.box_url = "http://domain.com/path/to/above.box"
# Create a forwarded port mapping which allows access to a specific port
# within the machine from a port on the host machine. In the example below,
# accessing "localhost:8080" will access port 80 on the guest machine.
# config.vm.network "forwarded_port", guest: 80, host: 8080
# Create a private network, which allows host-only access to the machine
# using a specific IP.
# config.vm.network "private_network", ip: "192.168.33.10"
# Create a public network, which generally matched to bridged network.
# Bridged networks make the machine appear as another physical device on
# your network.
config.vm.network "public_network"
# If true, then any SSH connections made will enable agent forwarding.
# Default value: false
# config.ssh.forward_agent = true
# Share an additional folder to the guest VM. The first argument is
# the path on the host to the actual folder. The second argument is
# the path on the guest to mount the folder. And the optional third
# argument is a set of non-required options.
# config.vm.synced_folder "../data", "/vagrant_data"
# Provider-specific configuration so you can fine-tune various
# backing providers for Vagrant. These expose provider-specific options.
# Example for VirtualBox:
#
# config.vm.provider "virtualbox" do |vb|
# # Don't boot with headless mode
# vb.gui = true
#
# # Use VBoxManage to customize the VM. For example to change memory:
# vb.customize ["modifyvm", :id, "--memory", "1024"]
# end
#
# View the documentation for the provider you're using for more
# information on available options.
# Enable provisioning with Puppet stand alone. Puppet manifests
# are contained in a directory path relative to this Vagrantfile.
# You will need to create the manifests directory and a manifest in
# the file hashicorp/precise32.pp in the manifests_path directory.
#
# An example Puppet manifest to provision the message of the day:
#
# # group { "puppet":
# # ensure => "present",
# # }
# #
# # File { owner => 0, group => 0, mode => 0644 }
# #
# # file { '/etc/motd':
# # content => "Welcome to your Vagrant-built virtual machine!
# # Managed by Puppet.\n"
# # }
#
# config.vm.provision "puppet" do |puppet|
# puppet.manifests_path = "manifests"
# puppet.manifest_file = "site.pp"
# end
# Enable provisioning with chef solo, specifying a cookbooks path, roles
# path, and data_bags path (all relative to this Vagrantfile), and adding
# some recipes and/or roles.
#
config.vm.provision "chef_solo" do |chef|
chef.cookbooks_path = "cookbooks"
#chef.roles_path = "../my-recipes/roles"
chef.data_bags_path = "databags"
#chef.add_role "web"
chef.add_recipe "apt"
chef.add_recipe "zsh"
chef.add_recipe "chef-oh-my-zsh-solo"
chef.add_recipe "vim"
chef.add_recipe "git"
chef.add_recipe "openssl"
chef.add_recipe "couchbase::server"
# setup users (from data_bags/users/*.json)
# chef.add_recipe "users::sysadmins" # creates users and sysadmin group
# chef.add_recipe "users"
# chef.add_recipe "users::sysadmin_sudo" # adds %sysadmin group to sudoers
# homesick_agent and its dependencies
# chef.add_recipe "root_ssh_agent::ppid" # maintains agent during 'sudo su root'
# chef.add_recipe "ssh_known_hosts"
# populates /etc/ssh/ssh_known_hosts from data_bags/ssh_known_hosts/*.json
# You may also specify custom JSON attributes:
#chef.json = { :users => "admin" }
chef.json = {
"couchbase" => {
"server"=> {
"password" => "123"
}
}
}
chef.log_level = :debug
end
# Enable provisioning with chef server, specifying the chef server URL,
# and the path to the validation key (relative to this Vagrantfile).
#
# The Opscode Platform uses HTTPS. Substitute your organization for
# ORGNAME in the URL and validation key.
#
# If you have your own Chef Server, use the appropriate URL, which may be
# HTTP instead of HTTPS depending on your configuration. Also change the
# validation key to validation.pem.
#
# config.vm.provision "chef_client" do |chef|
# chef.chef_server_url = "https://api.opscode.com/organizations/ORGNAME"
# chef.validation_key_path = "ORGNAME-validator.pem"
# end
#
# If you're using the Opscode platform, your validator client is
# ORGNAME-validator, replacing ORGNAME with your organization name.
#
# If you have your own Chef Server, the default validation client name is
# chef-validator, unless you changed the configuration.
#
# chef.validation_client_name = "ORGNAME-validator"
end
On detail is: If I remove the following lines, it's starting properly (but no usb available)
vb.customize ["modifyvm", :id, "--usb", "on"]
vb.customize ["modifyvm", :id, "--usbehci", "on"]
EDIT
Logs from Vlogs file
cat VBox.log
VirtualBox VM 4.2.16_Ubuntu r86992 linux.amd64 (Sep 21 2013 11:46:57) release log
00:00:00.033561 Log opened 2014-03-20T08:21:15.686771000Z
00:00:00.033570 OS Product: Linux
00:00:00.033572 OS Release: 3.11.0-18-generic
00:00:00.033575 OS Version: #32-Ubuntu SMP Tue Feb 18 21:11:14 UTC 2014
00:00:00.033610 DMI Product Name:
00:00:00.033624 DMI Product Version:
00:00:00.033756 Host RAM: 3882MB total, 3328MB available
00:00:00.033763 Executable: /usr/lib/virtualbox/VBoxHeadless
00:00:00.033765 Process ID: 10288
00:00:00.033767 Package type: LINUX_64BITS_GENERIC (OSE)
00:00:00.039722 Installed Extension Packs:
00:00:00.039747 VNC (Version: 4.2.16 r86992; VRDE Module: VBoxVNC)
00:00:00.046777 SUP: Loaded VMMR0.r0 (/usr/lib/virtualbox/VMMR0.r0) at 0xffffffffa0518020 - ModuleInit at ffffffffa052e0f0 and ModuleTerm at ffffffffa052e390
00:00:00.046820 SUP: VMMR0EntryEx located at ffffffffa052f510, VMMR0EntryFast at ffffffffa052f240 and VMMR0EntryInt at ffffffffa052f230
00:00:00.049809 OS type: 'Ubuntu_64'
00:00:00.073143 File system of '/home/ken/VirtualBox VMs/smartofficeVM_default_1395303674511_42792/Snapshots' (snapshots) is unknown
00:00:00.073166 File system of '/home/ken/VirtualBox VMs/smartofficeVM_default_1395303674511_42792/box-disk1.vmdk' is ext4
00:00:00.091096 VMSetError: /build/buildd/virtualbox-4.2.16-dfsg/src/VBox/Main/src-client/ConsoleImpl2.cpp(2300) int Console::configConstructorInner(PVM, util::AutoWriteLock*); rc=VERR_NOT_FOUND
00:00:00.091111 VMSetError: Implementation of the USB 2.0 controller not found!
00:00:00.091113 Because the USB 2.0 controller state is part of the saved VM state, the VM cannot be started. To fix this problem, either install the 'Oracle VM VirtualBox Extension Pack' or disable USB 2.0 support in the VM settings
00:00:00.217513 ERROR [COM]: aRC=NS_ERROR_FAILURE (0x80004005) aIID={db7ab4ca-2a3f-4183-9243-c1208da92392} aComponent={Console} aText={Implementation of the USB 2.0 controller not found!
00:00:00.217535 Because the USB 2.0 controller state is part of the saved VM state, the VM cannot be started. To fix this problem, either install the 'Oracle VM VirtualBox Extension Pack' or disable USB 2.0 support in the VM settings (VERR_NOT_FOUND)}, preserve=false
00:00:00.224473 Power up failed (vrc=VERR_NOT_FOUND, rc=NS_ERROR_FAILURE (0X80004005))
VAGRANT= debug vragant up log
http://pastebin.com/2GMhmy9T
Anybody has some expertise on the topic?
Thank you very much.
SOLUTION: I though it was already installed... when reading : 00:00:00.039722 Installed Extension Packs: 00:00:00.039747 VNC (Version: 4.2.16 r86992; VRDE Module: VBoxVNC) But in fact I have to install the extension guest pack on the Host too. It's a bit confusing. thank you very much. You can add a proper answer, I ll validate it.
The following line in VBox logs:
00:00:00.217535 Because the USB 2.0 controller state is part of the saved VM state, the VM cannot be started. To fix this problem, either install the 'Oracle VM VirtualBox Extension Pack' or disable USB 2.0 support in the VM settings (VERR_NOT_FOUND)}, preserve=false
Highlights that you have to install the VirtualBox Extension Pack in order to fix the issue.
Download and install VirtualBox extension pack from there (according to your VirtualBox version). It may solve your problem.