
Saturday, May 13, 2017

Installing MATE desktop environment on Fedora 25


(1) sudo dnf install @mate-desktop-environment

With Cinnamon already installed alongside the default Gnome3, the transaction fails at:

The downloaded packages were saved in cache until the next successful transaction.
You can remove cached packages by executing 'dnf clean packages'.
Error: Transaction check error:
  file /usr/share/man/man1/vim.1.gz from install of vim-common-2:8.0.586-1.fc25.x86_64 conflicts with file from package vim-minimal-2:7.4.1989-2.fc25.x86_64

Error Summary
-------------


This appears to be a long-standing bug, not related to the MATE desktop environment install but rather to the conflict between vim-minimal and the vim update, which can be seen here:  https://bugzilla.redhat.com/show_bug.cgi?id=1329015

The solution was running:

[root@system76-f25 ~]# dnf update -v --debugsolver vim-minimal -y

This updated vim, and MATE then installed successfully.

Lesson: always update before installing new software.
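
For reference, the full sequence on a fresh Fedora 25 system would look something like this (a sketch, with the group name spelling corrected from the failed command above):

sudo dnf upgrade -y
sudo dnf install @mate-desktop-environment -y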

Friday, May 12, 2017

Arc Touch Bluetooth Mouse in Fedora 25

Had some issues using the default bluez bluetooth GUI within Fedora 25 to connect to the Arc Touch Bluetooth mouse.  The bluetooth GUI would show "Not Setup" and no amount of clicking would allow me to perform any actions on the mouse.

Then I found bluetoothctl, which when run from the command line lets you pair and connect to any devices the bluetooth adapter can see.

$ sudo bluetoothctl
[bluetooth]# scan on                 (assumption: enables discovery so the mouse shows up under devices)
[bluetooth]# devices
[bluetooth]# trust [MAC of mouse]
[bluetooth]# pair [MAC of mouse]
[bluetooth]# connect [MAC of mouse]  (if pairing alone does not connect)

Then the prompt will change to [Arc Touch BT Mouse] and you can run info to show the current status.

[Arc Touch BT Mouse]# info
Device D2:23:01:9B:F3:65
    Name: Arc Touch BT Mouse
    Alias: Arc Touch BT Mouse
    Appearance: 0x03c2
    Icon: input-mouse
    Paired: yes
    Trusted: yes
    Blocked: no
    Connected: yes
    LegacyPairing: no
    UUID: Generic Access Profile    (00001800-0000-1000-8000-00805f9b34fb)
    UUID: Generic Attribute Profile (00001801-0000-1000-8000-00805f9b34fb)
    UUID: Device Information        (0000180a-0000-1000-8000-00805f9b34fb)
    UUID: Battery Service           (0000180f-0000-1000-8000-00805f9b34fb)
    UUID: Human Interface Device    (00001812-0000-1000-8000-00805f9b34fb)
    Modalias: usb:v045Ep0804d0001


There does appear to be some sort of idle-timeout period after which the connection drops, but activating the mouse by moving it reconnects it.  I need to investigate whether there is some udev rule or setting to prevent the timeout from occurring.


Saturday, August 20, 2011

DS4243 Disk Shelves


  • DS4243 FRUs, connectors, controllers & switches
  • Configuration and tasks related to DS4243
  • Field Replaceable Units (FRUs) related to the DS4243
    • The DS4243 is a FRU in and of itself, comprised of multiple other FRUs:
      • Chassis
        • 4U weighing 110lbs
      • PSUs
        • two required for SATA drive shelves
          • two empty slots must contain blank PSUs to provide air flow
        • four required for SAS drive shelves
        • Labeled 1-4 from left to right, top to bottom
      • IOM3
        • input-output modules
        • provide multi-path high availability
        • each IOM3 contains two ACP ports and two SAS ports
      • Ear Covers
        • left and right ear covers; the left ear cover fronts the digital readout of the shelf ID as well as the shelf LEDs
        • hidden behind the left ear cover is the shelf ID switch; the shelf must be power-cycled for a new shelf ID to take effect
      • HDDs
        • supports 24 disk drives, labeled 0-23 from top left to bottom right
        • SAS or SATA but not mixed, however different shelves in same stack can be mixed, use blank panels for empty slots to ensure proper air flow
      • SAS HBAs (controller)
        • quad-port PCIe
          • in 3000+ series FAS/V models
        • dual-port PCIe mini
          • only supported in the FAS2050 controller
      • QSFP Copper Cables
        • used to connect controller SAS HBAs to a shelf IOM3, or to daisy-chain SAS ports between disk shelves' IOM3s
      • Ethernet (ACP) Cables
        • extra shielding for increased EMI performance
        • from IOM3 to IOM3 or IOM3 to controller
      • Cable connectors
        • QSFP & 10/100
          • Quad-small-form factor pluggable connectors are keyed, and should be gently inserted into IOM3 SAS ports.
          • 10/100 connectors run at 10Mbps/100Mbps and provide the ACP Ethernet network
        • two 10/100 and two QSFP on each IOM3
          • QSFP is 4-wide, meaning 4 lanes per port at 3Gbps each, for a max of 12Gbps
      • LEDs
        • 51 per chassis
        • 3 shelf LEDs and 48 disk drive LEDs
          • Disk LEDs:  Power (top) Fault (bottom)
          • shelf LEDs: Power (top), Fault (middle), Activity (bottom)
        • PSU LEDs
          • each PSU has 4 LEDs:
            • AC Fail, PCM OK, Fan Fail, DC Fail
        • IOM3 LEDs
          • each IOM3 has 7 LEDs: IOM fault, Ethernet Activity & Ethernet Up/Down, then SAS activity

Friday, August 19, 2011

ASUP options for NCEP baseline filer


Autosupport (ASUP) settings for AWIPS NetApp FAS3160A:

CCC = siteID (e.g. NHCN)
RR = AWIPS DNS zone (e.g. sr)
XX = Site LAN subnet

AutoSupport option                            AWIPS Setting
------------------------------------------   ---------------------------------------------
autosupport.cifs.verbose                      off
autosupport.content                           minimal
autosupport.doit                              DONT
autosupport.enable                            on
autosupport.from*                             nas[1|2]-CCC@RR.awips.noaa.gov
autosupport.local.nht_data.enable             on
autosupport.local.performance_data.enable     off
autosupport.mailhost**                        165.92.XX.[5|6]
autosupport.minimal.subject.id                hostname
autosupport.nht_data.enable                   on
autosupport.noteto                            BLANK
autosupport.partner.to                        BLANK
autosupport.performance_data.enable           off
autosupport.retry.count                       15
autosupport.retry.interval                    4m
autosupport.support.enable                    off
autosupport.support.proxy                     BLANK
autosupport.support.to                        autosupport@netapp.com
autosupport.support.transport                 smtp
autosupport.support.url                       support.netapp.com/asupprod/post/1.0/postAsup
autosupport.throttle                          on
autosupport.to***                             netapp@dx[5|6].RR.awips.noaa.gov

*autosupport.from is the FQDN of the controller
**nas1 smtp mailhost is dx5; nas2 smtp mailhost is dx6
***autosupport.to is user netapp at FQDN of mailhost
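
These can be applied from the Data ONTAP CLI with the options command; a sketch for nas1 at a hypothetical sr-zone site with siteID nhcn (the XX subnet octet is site-specific):

> options autosupport.enable on
> options autosupport.content minimal
> options autosupport.from nas1-nhcn@sr.awips.noaa.gov
> options autosupport.mailhost 165.92.XX.5
> options autosupport.to netapp@dx5.sr.awips.noaa.gov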

Replacing a failed disk on a system without spares with a non-zeroed disk from another system

Sometimes we have to add disks from another system, or disks that are non-zeroed spares, to replace a failed disk in a system that is either without a spare or has more than one failure in a single aggregate.


This is fairly simple.  First, your replacement disk needs to be of the same speed as the ones contained in the aggregate; DataONTAP doesn't allow mixed speeds, for performance reasons, by default.  You can change the setting to allow mixed-speed drives in the same aggregate via raid.rpm.fcal.enable, but this isn't recommended.


First replace the failed disk with the replacement disk.


If the replacement disk is already a member of a foreign volume or aggregate, it will show up with a "(1)" appended; for example, if your replacement disk is from another system's vol0, it will show up in the new system as vol0(1), brought in offline and noted as "foreign."  This is helpful for the next step.


Locate the new disk in the system.  If it is a member of a foreign volume or aggregate, list it with vol status or aggr status and note the volume or aggregate name it is a member of, e.g. vol0(1).


rsyncnas-ancf> vol status

         Volume State      Status            Options
       vol0(1) offline     raid_dp, flex     fs_size_fixed=on
            foreign        degraded
           vol0 online     raid_dp, flex     root
            irt online     raid_dp, flex     nosnapdir=on, fs_size_fixed=on
           ndfd online     raid_dp, flex     nosnapdir=on, fs_size_fixed=on


First you need to destroy the volume the new disk was previously a member of:


> vol destroy vol0(1)

BE VERY CAREFUL HERE ... double check to make sure that you enter the foreign volume and not the system's root volume.  


Once the volume is destroyed, the disk becomes a spare; you can list the spares on the system via vol status -s or aggr status -s.  At this point you will need to zero the spare disk with:


> disk zero spares


Once the disk zeroes it is ready to be re-added to the system, assuming you previously did not have a spare.  If you lost a dparity disk, the RAID will have degraded from raid_dp to raid4 via DataONTAP's automatic mechanisms.  You can change the RAID type back on the aggregate from which the failed disk came via:


> aggr options aggr1 raidtype raid_dp


DataONTAP will automatically begin rebuilding/reconstructing the RAID with the spare you just added.
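
You can watch the reconstruction progress with the RAID status listing (aggr1 being the example aggregate name from above):

> aggr status -r aggr1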


If a disk failed while you were in the process of failing over the resources from one controller to another, and you failed back before the disk was replaced, sometimes DataONTAP will assign that new spare to the controller that originally owned the failed disk, but the reconstruction will have occurred on the partner node while the cluster was in takeover mode.  Therefore, the partner node will have one fewer spare; but when you insert the new disk it will be assigned to the node to which the original failed disk belonged, so that node will have one too many spares.


To reassign a spare disk to a partner controller:


> disk assign 1a.10.11 -o nas2-nhcn -f


Assuming 1a.10.11 is a spare and the partner nodename is nas2-nhcn.  You can use the partner systemID with the -s option.

Monday, August 8, 2011

SAN Implementation Storage System Configuration

FC and IP SAN Storage System Configuration Basics

FC SAN:

[1] Verify SAN topology:

The fcp config command is used to ensure that the ports are configured according to the requirements decided upon during the SAN design phase.  If onboard FC ports are being used, the FC port behavior may have to be set: if the FC ports connect to disk shelves, the port type needs to be set to FC initiator, while if they connect to the SAN fabric or the host, the behavior needs to be set to FC target.

[2] Enable FCP protocol:

First run the license command to see the available licenses; you can use license add to add subsequent licenses.  To enable FCP, use the fcp start command.  Finally, to verify the status, use the fcp status command.
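
A minimal sketch of that sequence (the license code itself is site-specific):

> license add <fcp-license-code>
> fcp start
> fcp status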

[3] Port behavior set:

The adapter must first be taken offline: use the fcp config command followed by the adapter name and the down option.  To set the behavior of the FC ports, use the fcadmin config -t [initiator|target] command.  In order for the changes to take effect, the system must be rebooted.
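
For example, to flip onboard port 0c (a hypothetical port name) to target mode, then reboot:

> fcp config 0c down
> fcadmin config -t target 0c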


[4] cfmode checked and set:

To check which cfmode is being used, use the fcp show cfmode command.  There are several things to consider: [1] storage system, [2] host operating system, [3] DataONTAP version, [4] topology features.  There are 4 cfmodes: [1] single_image; [2] partner; [3] dual fabric; [4] standby.  single_image is the default and is recommended for DataONTAP 7.2+ systems.

To set cfmode, the advanced privilege level must be entered: priv set advanced
Then FCP must be stopped: fcp stop
Next, use the fcp set cfmode command to change modes
Then restart FCP with fcp start
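
As a command sequence (single_image shown as the target mode):

> priv set advanced
> fcp stop
> fcp set cfmode single_image
> fcp start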


[5] WWPN:

To record the FC WWPNs, use the fcp show adapters and fcp show nodename commands.


[6] Check port configuration:

Verify the FC ports are online and that the speed and media type are correct; use the DataONTAP fcp config command to verify these values.  By default they are set to AUTO for autonegotiation.  You can manually set the values if the FC switch port or host cannot autonegotiate (rare).


[7] Create aggr/vol:

Create the appropriate aggregates and volumes, ensuring the sizes of the volumes are large enough for all LUNs and, if used, snapshots.  Also, the Unicode format must be enabled for volumes containing LUNs: vol options <volname> create_unicode on
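
A sketch with hypothetical names and sizes:

> aggr create aggr1 24
> vol create lunvol aggr1 500g
> vol options lunvol create_unicode on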




iSCSI SAN:


[1] Check Ethernet interfaces:

To bring up the interfaces, use ifconfig -a to view the available interfaces, then run ifconfig <interface> up|down to bring them up or down.  Ensure proper configuration of the IP address and netmask, and ensure the speed is set to autonegotiate or to the same speed as the network.  If these changes need to persist across reboots, the /etc/rc and /etc/hosts files need to be updated.
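
For example (interface name and addressing are assumptions):

> ifconfig e0a 10.0.0.50 netmask 255.255.255.0 up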

[2] License and enable iSCSI:

Run the license command to see the available package licenses.  Run the iscsi start and iscsi status commands to start iSCSI on the filer and verify that it started.

[3] Configure iSCSI Ethernet interfaces:

Now check that iSCSI traffic has been enabled on the Ethernet interfaces; these interfaces can be up, yet have iSCSI traffic disabled.  It is recommended that iSCSI traffic be separated from general TCP/IP traffic on Ethernet connections.  Thus it is encouraged to disable iSCSI traffic on e0m if it is being used for the RLM, and also to disable it on other ports used for general TCP/IP traffic, enabling it only on those Ethernet interfaces dedicated to iSCSI traffic.  Having host-side iSCSI initiators and storage-side iSCSI targets connected to a separate network is best practice.

Run the iscsi interface enable command and specify the interface(s) that iSCSI traffic will use.  Run the iscsi interface show command to see whether iSCSI traffic has been enabled on the correct interface(s).
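
For example, assuming e0m carries management traffic and e0a is dedicated to iSCSI:

> iscsi interface disable e0m
> iscsi interface enable e0a
> iscsi interface show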

[4] Verify target portal groups:

Verify interfaces are assigned to a valid iSCSI Target Portal Group (TPG).  By default DataONTAP assigns each iSCSI interface to its own default TPG to allow multiple iSCSI paths to each LUN.  To view the available iSCSI TPGs and which iSCSI target interfaces are assigned to each TPG, use iscsi tpgroup show.  To create a target portal group, run iscsi tpgroup create.
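
A sketch, with a hypothetical group name and interface:

> iscsi tpgroup show
> iscsi tpgroup create tg_iscsi e0a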

[5] Create aggr/vols:


Finally, create aggregates and volumes as in the FC SAN configuration above.

NetApp -- Flash Cache

FLASH CACHE BASICS

A PCIe expansion card used as a scalable read cache in NetApp storage systems.
Enables a disk-limited storage system to achieve its maximum I/O potential using fewer disks, and thus fewer resources (power, rack space, money).
Helps achieve lower read latency due to the faster access times of solid-state memory.
Flash Cache hits reduce read latency by a factor of 10.

Specifications:
Standard-height, 3/4-length x8 PCIe card
72 NAND flash chips, 36 on each side of the card; the density differs between the two card sizes: the 256-GB version has 72 32-Gb flash chips, while the 512-GB version has 72 64-Gb flash chips
Each card consumes a single PCIe Gen1 connection
256-GB & 512-GB cards are supported by DataONTAP 7.3.2+ (NOTE: not supported in 8.0, only 8.0.1+)

Under the aluminum heat sink in the center of the card there is a custom-designed controller; the PCI bracket is at the end of the Flash Cache card.

Indicators:
Two LEDs are located on the card -- you can see them, from the back of the controller, through the perforated PCIe cover.

The amber LED should be off under normal operation; if it is on, there is a problem with the card, which is taken off-line when a fault is detected.
The green LED indicates activity and provides a heartbeat indication by blinking at 0.5Hz; the blink rate scales with the I/O rate of the card as follows:  0.5Hz (< 1,000 I/O per second); 1.4Hz (1,000-10,000); 5.0Hz (10,000-100,000); 10.0Hz (>100,000).

Power:

Draws all required power from the 12V rail on the PCI connector
18W power consumption, below the 25W max required of all PCIe-supported platforms
10C-40C ambient operating temperature, lower than most PCIe components
Improved air flow allows the memory components to operate at lower ambient temperatures
95% less electricity than a shelf of 14 10,000-RPM disks (which is what the card allows to be eliminated)

Flash Cache Field Programmable Gate Array (FPGA)

One x8 PCIe 1.1 Interface
DMA access engine
Four independent 72bit async NAND interfaces with 18 flash devices
The flash data interfaces that connect to the flash devices are capable of running at 40MHz, so the raw bandwidth of the card is 1.28GB/s; when one interface is busy, another can take up the reads
An interface can operate on 9 flash devices in parallel at a time
Each of the 18 flash devices on an interface contains multiple 8GB NAND cores
256GB has 288 cores, 4 cores/device
512GB has 576 cores, 8 cores/device


FPGA Low Level Specifics:
Each NAND core is made up of blocks and cache wears out in increments of blocks
Each block contains pages, and pages are units of storage that data can be written into and read from
Across 9 parallel cores, 8 cores are for data, one is for parity (this is a bank)
8 banks for 256GB
If a NAND core loses too many blocks, it can be taken out of use without functional disruption
A DMA engine supports one Write and Erase queue per flash interfaces
Multithreaded DMA engines support 8 read queues for each interface
The DMA engine supports 520-byte sectors; the flash controller performs the write operations to flash memory and reports any issues
If a WAFL block read from Flash Cache fails because of an uncorrectable BCH error in flash memory, then the data is fetched from disk
Flash memory contents are protected by 4bit BCH codes
If a core fails the card continues to operate without any loss of capacity using parity to reconstruct data
If an entire bank of cores fails, the card continues to work with a reduced capacity

The dynamic interrupt mechanism speeds up or slows down to meet the host processing rate; the card is upgradable from a backup image and can match the power or thermal limits of the platform

Flash Cache FPGA Enhanced Resiliency
Wear Leveling: uses algorithms to ensure each block receives an equal amount of wear
Bad Block Detection and Remapping: the FPGA monitors and identifies worn-out blocks; failed blocks are replaced by the FPGA
BCH Error Correction Engine: soft errors during reads are handled here
Protection RAID: 8 data and 1 parity chips; can tolerate the loss of an entire chip
Dynamic Flash Remapping and Reduction: if two chips from the same bank are lost, software can map out that region of memory; when a significant portion is lost, an ASUP message is generated
WAFL Checksums: additionally, software stores a checksum with every WAFL block; if the check fails on a read, the data from Flash Cache is discarded and obtained from disk instead

Flash Cache Subsystem:

WAFL:  helps reduce the demand for random disk reads by reading user data and metadata from the external cache; it interfaces with the WAFL filesystem and controls and tracks the cache state.

WAFL External Cache (EC): is a software module that is used to cache WAFL data in external memory cards.  The EC can be used with either PAM1 or Flash Caches (>7.3.2).  Also supports Predictive Cache Statistics (PCS).  Contains three flow control processes: primary cache eviction, cache lookup, and I/O completion.  Per 0.5TB of Flash Cache card, 1.5 to 2.0 GB are preallocated for tag storage in the storage system main memory.

Flash Adaptation Layer (FAL):  is responsible for mapping a simple block address space on to one or more Flash Cache cards.  The FAL can manage cache writes in a way that produces excellent wear leveling, load balancing, and throughput while minimizing read variance that is caused by resource conflicts.  The FAL transparently implements bad block mapping, which gradually reduces flash capacity as flash blocks wear out.  Models flash memory as a single circular log across all blocks on cache.  Blocks must be erased before overwritten.  The number of erasures are limited, therefore wear leveling is important.  Round-robin scheduling of writes.  Reads and writes passed on to the Flash Cache driver.  Achieves wear leveling by placing EC writes in a circular log within a bank.

Flash Cache Driver (FCD):  manages all comms with the Flash Cache hardware, including request queues, interrupts, fault handling, initialization and FPGA.  Manages all Flash Cache cards, multiple cards are aggregated behind this FCD interface.  Provides memory unification, load balancing, and queuing across all cards.  Communicates through EMS by issuing messages for hardware status and error messages.  Automatically enabled when hardware detected.

Flash Cache Hardware:  The card itself.


Bad Blocks:
Two copies of the bad-block discovery table are kept, not stored in each flash block, ensuring only one bad-block table is erased at a time.  On power-up the driver goes through discovery; since the table is kept, initial power-up time is reduced.

Flash Management Module
Operates at a higher level, viewing the components of a Flash Cache as domains.
These domains are interfaces, flash banks, lanes, blocks, and cores.
The FMM assists in maintaining availability and providing serviceability.  It begins running when the storage system boots up, monitoring these aspects, and immediately begins discovering flash devices.  FMM is enabled by default in the Data ONTAP operating system.  When a flash device, such as Flash Cache, is discovered, the driver of the flash device registers it with the FMM for reliability, availability, and serviceability (RAS).

TROUBLESHOOTING, INSTALLING, DIAGNOSTICS

Shut down the controller
Open the storage system
Remove an existing module if necessary
Install the Flash Cache card
Close and boot the system
Run diagnostics on the new Flash Cache card (for a first-time install)
(Also enable the WAFL EC software and configuration options for a first-time install)
Complete the installation process

Enable the WAFL external cache software license: license add <license-code>
Enable the WAFL external cache software: options flexscale.enable on
If active-active (an HA pair), perform on both systems

Run sysconfig -v to show the slot in which the card is installed.  There are three states: "Enabled|Disabled|Failed".  Further details may also be listed if the state is failed, e.g. "Failed Firmware".

WAFL EC Config Options:

Cache normal user data blocks
Cache low-priority user data blocks
Cache only system metadata
To integrate the FlexShare QoS tool's buffer cache policies with WAFL external cache options use the priority command.

Default Flash Cache configuration:
options flexscale
flexscale.enable on
flexscale.lopri_blocks off << recommended to turn this on
flexscale.normal_data_blocks on


Note: when caching normal data, until the Flash Cache card is 70% full, all of the caching options are turned on; after the Flash Cache card is 70% full, the configured settings are identified and used.  In the default caching mode, a block is cached when it is evicted from the main memory cache.  When the data is accessed at a later time, it is obtained from the Flash Cache card, which is larger than the main memory.  In this mode, Flash Cache acts like the main memory.

Flash Cache caches the file data and metadata.  Metadata is not displayed as an option for the options command because metadata is cached instantly.  Metadata is the data that is used to maintain the file-level data structure and directory structure for NFS and CIFS data.  In the default mode, Flash Cache also caches normal data, which primarily consists of the random reads.

The recommended configuration for Flash Cache is to turn on caching for normal data blocks and for low-priority data, which includes random reads and some of the writes.
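
In command form, that recommended configuration would be something like (a sketch):

options flexscale.enable on
options flexscale.normal_data_blocks on
options flexscale.lopri_blocks on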

Predictive Cache Statistics:

Uses a sampling approach instead of an entirely new EC tag store, which is what PAM1 uses.
It samples, and only allocates and updates the sampled portion of the tag store, thus reducing CPU and memory usage.  Sampling rates are 1% and 2%; the default is 1%.

options flexscale
flexscale.enable pcs
flexscale.pcs_high_res off <<< turn on to use 2%
flexscale.pcs_size 1024GB <<< can change to test if more cache would help

LED Notifications:
Fault LED amber off // Green LED blinking -- NORMAL
Fault LED amber on // Green LED blinking -- hardware OK, but software has taken the card off-line
Fault LED amber on // Green LED solid -- unknown problem, hardware problem, may need to be replaced
Fault LED amber off // Green LED off -- hardware problem with power supply -- replace
Fault LED amber on // Green LED off -- hardware problem with FPGA config -- replace

Use sysconfig -v, the EMS logs, and the stats show ext_cache_obj command to display the flash cards that Flash Cache is using and the blocks that can be stored on each Flash Cache card.  Each card can store roughly 135 million 4k blocks.

FMM generates ASUP email notification messages.  ASUP needs to be enabled on the storage system, /etc/log/fmm_data is where information and settings are stored.  Case types are DEGRADED, OFFLINED, FAILED.


Saturday, July 23, 2011

Troubleshooting external connections in AWIPS using ADAM in two parts, Part 1: Network Setup

This is to aid in troubleshooting connections from AWIPS to the external site local LAN (e.g. from ADAM workstations out to the NWS Collaborative Web Server webpage).  While using the ADAM to NWS Collaborative connection as an example, these steps can be used for any sort of troubleshooting between a device and the firewall, out via the LLSW to the regional routers, and into cyberspace.

Basic topology setup:
 

  • ADAM1 = ADAM workstations
  • HSW# = AWIPS High-Speed Switches (Cisco 2960g6)
  • ADAM SW = ADAM shared media hub (5-port 10/100/1000 "pocket" switch)
  • DVB# = Novra S300 Digital Video Broadcast receivers

There are two network connections on ADAM, one to the AWIPS LAN, one to the DVB as follows:

  1. [ADAM1-CCC] --- [HSW2, g0/43] 
  2. [ADAM1-CCC] --- [ADAM SW] --- [DVB2]


The way the switch interconnects with the ADAM and the firewall is a little more complicated.

The ADAM1 connects to HSW2, g0/43 (GigabitEthernet interface 0/43, i.e. switchport 43).

HSW2 and HSW1 are combined into a sort of single switching domain via Rapid Spanning-Tree Protocol.

  • Spanning-Tree Protocol allows for multiple paths from a single device to be connected to a switch without any adverse side-effects like bridge loops or multiple paths detected to a single device.
RSTP defines a "root bridge" out of which all traffic will go.  Since ADAM is connected to HSW2, and the root bridge as defined in the configuration is HSW1, incoming traffic from ADAM destined for a device beyond the switch will go out HSW1 (under normal configuration and setup).  So, for ADAM to talk to the firewall we have:

[ADAM1, eth0] --- [[ HSW2, g0/43] --- [HSW2, g0/47 ]] --- [HSW1, g0/21] --- [[ FW/SW1, port1] --- [FW/SW1, port 2 ]] --- [FW1, eth0/1]

That's a lot of hops, but it basically says that the ADAM network interface eth0 connects to the switches, which send the traffic out to the firewall.  Between the HSW and the firewall (FW) there is another shared-media hub (5-port "pocket" switch) for high availability of the firewall cluster; that is what FW/SW1 is.  Port 1 connects from AWIPS, port 2 connects to the firewall.  There are two FW/SW switches in the LDAD rack, and each firewall has a single connection from its FW/eth0,1 port to each FW/SW --- so the highly available firewall service can move between FW1 and FW2.  Note that the high-availability package for the firewalls is named GW1; the actual machine is named FW1 or FW2.  There is no GW2.

Once a connection is made to GW1 (assumed, in the above topology, to be running on FW1), should the request's destination address from the ADAM be an external IP address like the NWS Collaborative Web Server (e.g. 140.90.75.234), the routing on the firewall will direct the request, coming in on eth0,1, out eth0,2, the "Untrust" network connection.  The other end, FW1 eth0,2, goes to the LLSW, or the Local LAN SWitch.  There are usually four connections on the LLSW: one to the site LAN, which goes out to the regional routers and then cyberspace; one to the LS2 and one to the LS3; and one to the LTS (LDAD Terminal Server).  There may also be a VIR connection.

So the topology from the Firewall to cyberspace would be:

[[GW1, eth0,1] --- [GW1, eth0,2]] --- [LLSW] --- PATCH PANEL, or SITE LAN connection


All we care about here is the connection to the LLSW, which is a Tiger 24-port switch for WFO/RFCs and an Edge-corE ES4528V 28-port switch for NCEP.  You can only connect to the LLSW via the LTS, which is only accessible via the LS2 or LS3 through the LLSW LAN connections.

Putting it all together:


A request generated from the ADAM workstation to 140.90.75.234 will follow this path, assuming all software and firewall configurations are setup properly (discussed in part 2):

[ADAM1, eth0] --- [[HSW2,g0/43] --- [HSW2,g0/47]] --- [HSW1,g0/47] -- [HSW1, g0/21] --- [[FW/SW1, port 1] --- [FW/SW1, port2]] --- [[FW1, eth0,1] --- [FW1, eth0,2]] --- [LLSW] --- Patch Panel or Site LAN connection --- cyberspace


While it seems a bit complex, it is rather simple; I made the above as detailed as possible, but the simple path would be:

  1. ADAM workstation
  2. AWIPS High-Speed Switch
  3. Firewall Switch to the Firewall
  4. Out the Firewall to the Local LAN switch (LLSW)
  5. Out to the internet
In Part Two we will discuss the actual setup required on the devices to allow the connections, and trace all the steps with commands and references.

Saturday, July 9, 2011

Gnome-Shell Musings and Abusings in Fedora 15 (Lovelock)

Gnome-Shell (gnome3)

PART ONE:  A glancing blow & basics ... from my trying to understand the new interface.  Realize I am not a programmer, I am a systems engineer, so I am venturing into territory I wish not to tread too long, as I find this stuff extremely esoteric and boring; but it is sort of required to get around with any proficiency as a power-user, or as someone wanting to customize the desktop before it becomes more user-friendly with GUIs and front-ends (at least nice ones).  I also do not claim to have discovered or identified this information on my own by perusing lines of code, nor do I claim it to be accurate, simply "as how I currently understand things" ... this is the internet after all, and a blog on top of that, and one that maybe 3-4 people will ever see, but it helps me remember things to write them down and post them for future use.

While it took me some time to adjust, I am starting to customize and functionally use gnome-shell, aka Gnome3, as my desktop manager in Fedora 15.  I have been a long-time Gnome user, and even jumped ship when KDE4 first came out to try out the "future of the desktop," but alas came back to my comfort zone in Gnome.  Now that Gnome has had its first major rewrite in, well, ever, I didn't know what to do.  I even (shhh) jumped back to the latest iteration of the KDE desktop but remembered quickly why I hate KDE.  So here I am, about 4 days into gnome-shell, two jumps over to KDE, and a jump back over to Gnome Classic with Compiz, and I am forcing myself to adjust.  It's rough, especially since I use my laptop for work and don't need the user environment to "get in the way" of productivity, but it is starting to, dare I say, make some semblance of sense.

Some of my issues with gnome-shell, some of which may be customizable, I list here ... note that the ability to easily customize just isn't there yet and requires either command-line launching of "hidden" configuration editors or not-widely-advertised conf editors.  I will try to talk about these as well.


  1. The resizing of windows is very sensitive, and only a small area on the window border is active to show the resize cursor.  For those of us without surgically precise hand movements, it makes resizing like a game of Operation, constantly having to carefully maneuver to avoid losing the resize cursor.
  2. The lack of a minimize button on windows, only close and maximize.  While this is part of the philosophy behind gnome-shell (not needing to keep minimizing windows, simply move them to a new workspace) it is a very hard habit to break.
  3. Due to #2's philosophy of using multiple workspaces, it makes cut & paste cumbersome.  While before it was as simple as highlight and double-mouse-button click, now you have to highlight, move over to the Activities tab, move over to the workspace switcher, find the application you wish to paste into, right-click once to choose the workspace, right-click again to move into the workspace, then double-mouse-button click to paste.  You go from two steps and no workspace changes to seven total steps.
  4. Lack of easy customization of the desktop; while I fully expect this to improve with time, there isn't a built-in, easy, well-advertised and documented way to customize the desktop.  I will talk more about this now.
The basics of Gnome desktop management remain unchanged, albeit renamed and reworked.  For example, Metacity, the long-lived window manager, is now named Mutter, and it incorporates a branch of Metacity.  Mutter is built on Clutter - everyone has to be cute - which is a "scene graph" (basically a library) built on openGL (or openGL ES for mobile or embedded devices - ahh, the unification of platforms is afoot).  Think of openGL as the beams of a floor in your house; Clutter would be the subfloor built on top of the beams; and Mutter is the hardwood or tile or carpet put down on top, which you see and walk on (use).

  • openGL/openGLes -- a set of rules or specifications that software programs can follow to communicate with each other to produce 2D and 3D graphics, or more technically a "cross-platform, cross-language API for 2D and 3D rendering."

    If gnome-shell were a language, openGL would be the alphabet.
  • Clutter & Mutter -- similar to openGL but built on top of it, further extrapolating to define and produce the graphical user interface.  Mutter is the desktop manager, and it is built off a branch of Metacity, the legacy desktop manager of Gnome.

    Mutter is sort of the vocabulary or vernacular of a language, defining a set of standards and guidance for the spoken and written word.  Without openGL (the alphabet) you can't actually create spoken or written word, but without a vocabulary or vernacular you cannot express ideas with others, or interface.
  • Shell Toolkit:  is implemented on top of Clutter, and it provides further base shapes, actions, items such as scrollbars, windows, etc.  Think of the toolkit as a "lexical category" or more commonly known as a "part of speech," e.g. noun, pronoun, adjective.  It builds out further from the vocabulary a more detailed set of standards most commonly used.  The shell toolkit also supports CSS (Cascading Style Sheets) which is even further customization of the look and feel of the desktop and is a widely used standard in webpages but can also be applied (as it is here) to XML.
  • GObject:  not going to go too deep into this, but think of it as sort of a C++ or Objective-C library or alternative.  It provides object oriented programming that doesn't introduce a new compiler, so it is portable and cross-platform, platform independent, not introducing a new compiler or syntax like C++ does.  This would sort of be like those words from other languages incorporated into another, like Hors d'Ĺ“uvre.
  • JavaScript Engine:  basically what it says, it provides a way to execute and interpret JS code ... the package is named GJS (Gnome JavaScript) in keeping with the KJS engine introduced back when KDE4 was released.  I guess, trying to keep with my analogy, this would be your brain.
  • Extensions:  provides a way to allow customization of the desktop without having to patch or submit the code to the project for incorporation upstream.  It simply loads and executes JS or CSS code.  This is a new way of thinking for Gnome, and for users.  Everything utilizes these extensions to manipulate the GJS, something as simple as removing an icon from the "system tray" requires an extension.  Perhaps this would be akin to local dialect or slang that becomes so widely used it is a part of the base lexicon ... like yo, tweet, or "buddy me".

    Gnome provides some nice, albeit sparse, documentation here, from where I grabbed the picture of the architecture shown above this section: http://live.gnome.org/GnomeShell/Technology

    I guess next I will talk some about the ways to configure the desktop with extensions, config editors, looking-glass (the GJS debugger).

Thursday, July 7, 2011

Laughlin to Lovelock -- Driving tour of Nevada cities or Fedora distributions

OK, so here we go, attempting a preupgrade of Laughlin to Lovelock.  This isn't meant to be anything other than my attempt(s), and while some things might have already been noted in other places or in release notes, like a good geek I am going at this blind and troubleshooting as I go.  I did basically just a yum update, followed by a yum install preupgrade, then ran preupgrade from the terminal and followed the "ahem" GUI along to start the process.


Checking for new repos for mirrors
* preupgrade-updates: mirror.umoss.org
unknown metadata being downloaded: metalink.xml.tmp
unknown metadata being downloaded: repomdcGv3ZLtmp.xml
unknown metadata being downloaded: MEMORY
Fetched treeinfo from http://mirror.metrocast.net/fedora/linux/releases/15/Fedora/x86_64/os//.treeinfo
treeinfo timestamp: Fri May 13 15:44:30 2011
unknown metadata being downloaded: MEMORY

This appears to be caused by the adobe repository; removing or disabling the repo allowed the "Downloading filelist metadata..." portion to continue, and I was able to do this dynamically, without having to stop and restart the process.  You should probably check which repos you have enabled (listed in the yum.repos.d directory) and disable/enable any accordingly (in conjunction with the following package-pruning suggestion as well).
Checking for new repos for mirrors
* preupgrade-updates-testing: mirror.symnds.com
Downloading 1.4GB
Available disk space for /var/cache/yum/preupgrade: 12.1GB

As an FYI, there appears to be a requirement of about 1.5GB of space in /var in order to download the required packages.  This will obviously differ according to the number of packages you have installed.  You should probably prune any unwanted packages or software prior to the preupgrade process, especially if you do not have the required space in /var.  I do have a lot of bloat, but I have the space.

Should also note, Eclipse took about 5-8 minutes to download during the process; to save time, you might want to uninstall it and reinstall later, since I am going to have to remake the workspaces anyway.

So adobeair ended up being a game changer, which sucked ... on top of the repo issue noted above, when the install started, after the downloads, it failed on adobeair, and so the install failed.  I rebooted into an older kernel (Fedora 14) and was able to remove the adobeair RPM altogether, then rebooted and kicked off the Lovelock kickstart again.  It completed, but alas, had a kernel panic.  Back to F14 (thank god for rescue kernels kept on the system), where I removed kernel-2.6.38.8-13.fc15.x86_64 and then reinstalled it via yum --releasever=15 install kernel-2.6.38.8-13.fc15.x86_64.

Then, both fedora-release RPMs were installed, one for 14-1 and one for 15-3 ... but the files all read 15: /etc/redhat-release, /etc/fedora-release, blah blah (rpm -ql ${PACKAGE_NAME} will show the files installed by a package) ... anyway, I tried to figure it out, but just removed the 14-1 and it was fine.  What happened was the yum cache kept pointing to 14; yum clean all, removing the cache, all the goodies, didn't work ... figured removing fedora-release-14.1 would fix it, and it did, but the rescue (saved) kernels are still fc14, and while I don't plan on going back and running/updating in them, it's kind of annoying.

Well, enjoying Gnome-Shell (aka Gnome 3) now ... will see how long I can stand this new environment .. it still isn't as bad as KDE.

Monday, June 27, 2011

Additional SCSI commands for multipath

Some additional commands for multipath and SCSI devices:

To remove a single SCSI device from the partition table:

# echo "scsi remove-single-device 2 0 0 2" > /proc/scsi/scsi

assuming the SCSI ID is 2:0:0:2, which you can obtain from the output of multipath -ll or cat /proc/scsi/scsi, among other places; note that the values reference:

HOST:CHANNEL:ID:LUN

To add it back, simply change remove-single-device to add-single-device in the command above.
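
That is:

# echo "scsi add-single-device 2 0 0 2" > /proc/scsi/scsi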

You can also rescan the SCSI device, via:

# echo 1 > /sys/class/scsi_device/2\:0\:0\:2/device/rescan

Most of these, however, will not cause the multipath daemon to "see" the new sizes.  These were among various "options" I saw online, none of which really work, but I present them here just for reference.

Dynamically Resize Multipath SCSI devices

I've seen quite a few "ways" on the web to resize the multipath SCSI devices that are created via the Gnu/Linux multipathd service, none of which actually did what they claimed, or they would ultimately result in downtime or unavailability of the multipathed physical storage.  Here is what I did, with some additional comments, to get a NetApp filer LUN resized and the larger storage made available to linux hosts running RHEL5u5 that direct-connect to the filer (via LC/LC optical cables and a QLogic HBA).

SETUP

  • 300GB /vol/vol_aiihdf5/base-lun on NetApp 3160A filer
  • RHEL5u5 host, dual-single port 8GB QLogic HBA
RESIZING THE STORAGE
  1. log into the primary controller for the physical storage location

  2. assuming available space remaining on the aggregate on which the volume exists:

    # vol size awipsiihdf5 +170g

    The above increases the volume named awipsiihdf5 by 170GB. Note this is a delta increase; the existing size is 400GB, so the increase will set the size to 570GB.

  3. Resize the pre-existing LUN created under the volume:

    # lun resize /vol/vol_awipsiihdf5/base-lun 570g

  4. You can check that the resize was successful by issuing a lun show or other commands in the CLI on the filer, or you can use the NetApp Network Manager or Filer View interface instead. Regardless, resizing the physical storage on the filer is the first step in the process.


RESIZING THE MULTIPATH SCSI DEVICES

  1. While you aren't really "resizing" the SCSI devices, you are re-registering them to "see" the new underlying physical storage increase on the linux hosts. First, check that the HBA correctly "sees" the newly resized physical storage on the filer, assuming that you have some CLI interface on the linux host such as santools by NetApp or scli from QLogic.

    # sanlun lun show -p dx1-das:/vol/awipsiihdf5/base-lun

    dx1-das:/vol/awipsiihdf5/base-lun (LUN 3) Lun state: GOOD
    Lun Size: 570.0g (609934639104) Controller_CF_State: Cluster Enabled
    Protocol: FCP Controller Partner: dx2-das
    DM-MP DevName: mpath4 (360a98000503361384e6f624e41594c59) dm-14
    Multipath-provider: NATIVE
    ---------   ----------  -------  ------  ----------  ----------
    sanlun      Controller  /dev/    Host    Primary     Partner
    path        Path        node     HBA     Controller  Controller
    state       type                         port        port
    ---------   ----------  -------  ------  ----------  ----------
    GOOD        primary     sdd      host2   0a          --
    GOOD        secondary   sdh      host3   --          0a
    What we are looking for is acknowledgement on the linux host that the HBA recognizes the new size of the LUN; indeed, the output above shows 570.0g, which is what we resized to in the previous steps.

    Should the new sizes not be seen, you may want to rescan the host bus, for example:

    # echo 1 > /sys/class/fc_host/host2/issue_lip
    # echo 1 > /sys/class/fc_host/host3/issue_lip

    This will cause the kernel to rescan the connections on the fibre channel host bus adapters (host2 and host3) and hopefully relearn the sizes of the physical storage.

  2. Check the multipath, it should show the original, unresized, value for the path. Note the mpath identifier is mpath4 from the above output.

    # multipath -ll mpath4

    mpath4 (360a98000503361384e6f624e41594c59) dm-14 NETAPP,LUN
    [size=400G][features=1 queue_if_no_path][hwhandler=0][rw]
    \_ round-robin 0 [prio=4][active]
    \_ 2:0:0:3 sdd 8:48 [active][ready]
    \_ round-robin 0 [prio=1][enabled]
    \_ 3:0:0:3 sdh 8:112 [active][ready]

    Indeed, multipath still shows the size of 400G. For future steps, note the two SCSI device IDs assigned by multipath to the fibre connections to the filer, above it is sdd (SCSI ID 2:0:0:3) with Major:Minor number 8:48 and sdh (SCSI ID 3:0:0:3) with Major:Minor 8:112.

  3. This is where I see a myriad of suggestions from editing the device-mapper table manually, to removing the SCSI devices from the system and readding them manually. These are rather drastic steps considering multipath provides a nice CLI into the daemon, multipathd -k.

    Before continuing, however, save off the existing device-mapper table in case we muck things up in the process; we can restore the table via dmsetup [suspend|reload|resume].

    # dmsetup table > /root/dmsetup_table.orig

    Should we muck things up, and need to reload the table, you would issue the below, but do not do this now, this is only an FYI:

    # dmsetup suspend; dmsetup reload /root/dmsetup_table.orig; dmsetup resume

    OK, back to resizing, once saving off the device-mapper table, check the current block device sizes of the SCSI devices assigned by multipath, remember above we noted sdd and sdh from the output of multipath -ll mpath4:

    # blockdev --getsize /dev/sdd
    836849664

    Blockdev allows you to call device ioctls from the command line; the --getsize option outputs the device size in 512-byte sectors.  One byte is 9.31322575 x 10^-10 GB in case you want to verify; indeed, the current size of the block device assigned by multipath to the physical storage is 400GB (note: there may be discrepancies of about 1GB due to conversions and overhead).

    Reread the device from the partition table:

    # blockdev --rereadpt /dev/sdd

    And issue a getsize again:

    # blockdev --getsize /dev/sdd
    1191278592

    Indeed now, the device is "seen" as 570GB. However, only one of the two paths has been reread, remember there are at least two paths (or SCSI devices assigned) when using multipath (hence the name). There could be more than two paths, but at minimum, and in this case, we have two. Reread the second (and subsequent) SCSI devices:

    # blockdev --getsize /dev/sdh
    836849664

    # blockdev --rereadpt /dev/sdh

    # blockdev --getsize /dev/sdh
    1191278592

    OK, so now both SCSI devices in the multipath are changed, in the kernel layer. Multipath however still has not been updated to the new sizes. We now need to re-register multipath using the CLI.

    # multipathd -k

    This will get you into the multipath interface, simply resize the multipath ID assigned to the LUN, in our case it was mpath4:

    multipathd> resize multipath mpath4
    OK

    That's it, now reissue a scan on the multipath and you should now see the larger size:

    # multipath -ll mpath4


    mpath4 (360a98000503361384e6f624e41594c59) dm-14 NETAPP,LUN
    [size=568G][features=1 queue_if_no_path][hwhandler=0][rw]
    \_ round-robin 0 [prio=4][enabled]
    \_ 2:0:0:3 sdd 8:48 [active][ready]
    \_ round-robin 0 [prio=1][enabled]
    \_ 3:0:0:3 sdh 8:112 [active][ready]

    That's all fine-and-dandy, but the filesystem we created on this storage device won't be resized until we then resize it.  We created a physical volume out of the multipath device (/dev/dm-14) and use LVM.  Note /dev/dm-14 is sort of an all-encompassing device map to the storage location.  You could theoretically use the single SCSI block devices, e.g. /dev/sdd or /dev/sdh, but if you only use one, then should that path fail, you will lose connectivity to your storage.  Always use the device-multipath device name.

    Here is a quick "what to do" should you have to resize an LVM setup on the physical storage; after this, you will have a 170GB-larger filesystem to use and abuse until management tells you to increase it again instead of fixing the problems related to the software not purging the directory ;-)

    # pvscan | grep hdf5
    PV /dev/dm-14 VG vg_aiihdf5 lvm2 [399.04 GB / 40.00 MB free]

    # pvresize /dev/dm-14
    Physical volume "/dev/dm-14" changed
    1 physical volume(s) resized / 0 physical volume(s) not resized


    # pvscan | grep hdf5
    PV /dev/dm-14 VG vg_aiihdf5 lvm2 [568.04 GB / 169.04 GB free]



    Ensure volume group was resized during pvresize

    # vgdisplay /dev/vg_aiihdf5 | grep "Size"
    VG Size 568.04 GB
    PE Size 4.00 MB
    Alloc PE / Size 102144 / 399.00 GB
    Free PE / Size 43275 / 169.04 GB
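
    Note that the vgdisplay output above still shows the logical volume allocated at 399 GB; before the filesystem can use the new space, the logical volume itself has to be extended into the free extents (a step that appears to have been elided here).  A sketch, using the LV path from the resize2fs output below:

    # lvextend -l +100%FREE /dev/vg_aiihdf5/awipsiihdf5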



    Resize the filesystem on the logical volume, while online (without having to umount or lvchange).  Note this will take a few minutes:

    # resize2fs -fp /dev/vg_aiihdf5/awipsiihdf5
    resize2fs 1.39 (29-May-2006)
    Filesystem at /dev/vg_aiihdf5/awipsiihdf5 is mounted on /awips2/edex/data/hdf5; on-line resizing required
    Performing an on-line resize of /dev/vg_aiihdf5/awipsiihdf5 to 148111360 (4k) blocks.
    The filesystem on /dev/vg_aiihdf5/awipsiihdf5 is now 148111360 blocks long.


    Verify the hdf5 partition has been resized:

    # df | grep hdf5
    /dev/mapper/vg_aiihdf5-awipsiihdf5
    583149656 142582728 410946784 26% /awips2/edex/data/hdf5

Saturday, July 5, 2008

XPS LED Control Apps

http://linux.dell.com/libsmbios/download/libsmbios/libsmbios-0.13.4.1_BETA/

Download the tar.gz, then:

./configure && make && sudo make install
sudo modprobe dcdbas
/sbin/dellLEDCtl -h
/sbin/dellLEDCtl -i
/sbin/dellLEDCtl -z1 1 -z2 2 -z3 2 -l 7

Make sure you add the module via depmod or to the modules in the run command directories depending on your distro.

You can also use this program:

http://sourceforge.net/projects/xpsledchanger/


It uses libsmbios to access the LEDs, but you don't have to work through the command line.  It's a simple gui that lets you change the color and intensity in just one click.

Currently it only works on Ubuntu. But if you tell me where the dellLEDCtl file is on your distro, I can add it to the code (or you can do it yourself if you know python).

---- Fun with LEDs ----

I created an hourly cronjob that runs the flash options:

┌────[root@xps-m1710]───────[09:26:23]───────[...hare/backgrounds/waves]──
└──> cat /etc/cron.hourly/dellLEDCtl.cron
/usr/local/bin/dellLEDCtl -s

It's a nice way to keep track of how much time you are wasting, because I often get sucked into things and 4 hours pass without me knowing; now, every hour I get woken up from my stupor.