Scheduled maintenance

maintenance of grunch

mhu027 • April 8, 2022

Dear Grunch users,

We plan maintenance and downtime of grunch.hpc.uib.no for about an hour this Monday. Although grunch and cyclone share some filesystems, only grunch will be restarted; cyclone will not be affected.

Edit 2022-04-11 15:20: We have cancelled the maintenance; it will not take place this week or next. We will post a new notice once it is rescheduled. Sorry for the false alarm.

Best regards,

The IT department and the Scientific Computing Group

migration of Data on cyclone and clients 27 May

mhu027 • May 25, 2021

Dear users of cyclone and Ubuntu,

A large part of the data under /Data/ is being migrated to the central IT department storage. To finish the migration, this Thursday between 9:00 and 15:00 paths below /Data/ will be intermittently inaccessible (or at least not writable).

After the migration, on both the Ubuntu clients and on cyclone, all your files will still be available under the regular /Data/gfi and /Data/skd paths you are used to. On Windows the new locations will be as follows:

\\klient.uib.no\FELLES\MATNAT\GFI
\\klient.uib.no\FELLES\MATNAT\SKD

For your convenience, you can map each to a drive letter by following this HOWTO.
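
As a rough illustration of the drive mapping, the sketch below builds the standard Windows `net use` command for the two shares listed above. The linked HOWTO may describe a different (for example graphical) procedure; the drive letters G: and S: and the helper names `net_use_command` and `map_share` are assumptions made for this example only.

```python
from __future__ import annotations

import subprocess
import sys

# Share paths taken from the notice above.
SHARES = {
    "GFI": r"\\klient.uib.no\FELLES\MATNAT\GFI",
    "SKD": r"\\klient.uib.no\FELLES\MATNAT\SKD",
}


def net_use_command(drive_letter: str, unc_path: str) -> list[str]:
    """Build the Windows `net use` command that maps unc_path to drive_letter."""
    letter = drive_letter.rstrip(":").upper()
    if len(letter) != 1 or not letter.isalpha():
        raise ValueError(f"not a drive letter: {drive_letter!r}")
    # /persistent:yes asks Windows to restore the mapping at the next logon.
    return ["net", "use", f"{letter}:", unc_path, "/persistent:yes"]


def map_share(drive_letter: str, unc_path: str) -> None:
    """Run the mapping; only meaningful on a Windows client."""
    if sys.platform != "win32":
        raise RuntimeError("drive mapping via `net use` only works on Windows")
    subprocess.run(net_use_command(drive_letter, unc_path), check=True)


if __name__ == "__main__":
    # Drive letters are arbitrary choices for illustration (assumption).
    for letter, share in (("G", SHARES["GFI"]), ("S", SHARES["SKD"])):
        print(" ".join(net_use_command(letter, share)))
```

Running the script only prints the commands; calling `map_share("G", SHARES["GFI"])` on a Windows client would actually perform the mapping.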

On the Wiki there is more detailed information on GFI & SKD Storage.

Feel free to contact us via hjelp.uib.no if you have any questions.

Update 28.05: The data migration is finished and all Linux and Windows paths with the new destinations should be accessible.

Best regards,

The IT department and the Scientific Computing Group

Downtime: Machine room power outage at Thormøhlensgate 55, April 17-19

saerda • April 2, 2020

Dear Cyclone and Grunch Users,

Our machine room at Thormøhlensgate 55 will have a power outage from Friday 17 April 15:00 to Sunday 19 April 24:00.

Maintenance work is planned on the power line in the building.
Because of this maintenance, we have to take down all our servers at 15:00 on Friday 17 April, and we hope to bring all servers back online before 08:00 on Monday 20 April.

The cyclone and grunch servers will be taken down and will not be accessible during this time.

The /shared/ filesystem, which is exported via leo.hpc.uib.no to the university campus, will not be accessible during this time.

We advise all users to plan their work well in advance to avoid unnecessary problems.

Update 19.04.2020 20:30: The downtime is over; cyclone and grunch are back online.

Please contact us via hjelp.uib.no if you have any further questions.

Best Regards

Scientific computing team.

Cyclone maintenance 25th September

saerda • September 24, 2019

Cyclone will be taken down tomorrow, 25th September, from 12:00 until 14:00 for regular maintenance. We will apply OS-related updates and also update some core libraries. More information will be posted.
Update 25.09.2019 14:55: cyclone.hpc.uib.no is back online.

Cyclone reboot planned for 09:00 19-03-2019

lsz075 • March 19, 2019

Update 10:40: Access to cyclone has been reopened. The delay was caused by missing kernel modules and old Lustre packages.

Cyclone will be rebooted at 09:00 sharp to apply new filesystem settings.

Scheduled maintenance for /shared file system on 5th of November

lsz075 • October 22, 2018

Update 12.11 21:30:

The migration is over; we managed to bring up the Lustre filesystem with the new MDS server. The /shared and /work filesystems are mounted on cyclone.hpc.uib.no and grunch.hpc.uib.no. Hexagon is up and running again. Samba and NFS exports are also running on Leo.hpc.uib.no.

Update 12.11 15:00:

The migration is still ongoing; we will keep you posted.

Update 02.11 09:30:

Due to the delayed delivery of physical parts, we have to postpone our downtime to 12th November. The corresponding node reservation on Hexagon is also postponed to 12th November.

Thank you for your consideration!

Dear HPC User,

The metadata server for the /shared file system has to be replaced/upgraded, so the file system must be unmounted from all clients.

This will result in scheduled downtime for the Hexagon, Grunch and Cyclone machines. We will start at 08:00 on the 5th of November and expect to be ready by the end of the working day.

Thank you for your consideration!

Hexagon downtime for /work filesystem maintenance

saerda • August 2, 2018

Hexagon will have planned maintenance on 15th August from 08:00.

Currently the /work filesystem is running at reduced performance due to a broken storage controller.

During the maintenance, we will replace the broken storage controller for the storage system where the /work filesystem resides. Due to the high risk of data loss, we urge all /work filesystem users to back up their important, non-reproducible data.
Please keep in mind that /work is a scratch filesystem and is not backed up.

After the maintenance, we expect the /work filesystem to be back at full performance.
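
As one possible way to make such a copy, the sketch below recursively copies a directory to another location and performs a simple check that the same files exist in both trees. The helper names `backup_directory` and `same_relative_files`, and any paths you pass to them, are assumptions for illustration; tools such as rsync or scp may suit large datasets better.

```python
import shutil
from pathlib import Path


def backup_directory(source: Path, backup_root: Path) -> Path:
    """Copy source recursively into backup_root/<source name>; return the copy's path."""
    source = Path(source)
    if not source.is_dir():
        raise NotADirectoryError(str(source))
    target = Path(backup_root) / source.name
    # dirs_exist_ok=True (Python 3.8+) allows rerunning over an earlier copy;
    # existing files are overwritten, but files deleted in source are not removed.
    shutil.copytree(source, target, dirs_exist_ok=True)
    return target


def same_relative_files(source: Path, copy: Path) -> bool:
    """Cheap sanity check: both trees contain the same relative file paths."""
    source, copy = Path(source), Path(copy)
    src_files = {p.relative_to(source) for p in source.rglob("*") if p.is_file()}
    dst_files = {p.relative_to(copy) for p in copy.rglob("*") if p.is_file()}
    return src_files == dst_files
```

A call might look like `backup_directory(Path("/work/<user>/results"), Path("<backup location>"))`, where both paths are placeholders to be replaced with your own directories outside /work.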

We appreciate your understanding.

Update 15.08.2018 11:00

The Hexagon maintenance is over; we have successfully replaced the broken controller. The /work filesystem is back to its expected performance.

Scheduled downtime on Hexagon and /shared on 23.05.2018

lsz075 • May 16, 2018

Update 23.05.2018 15:19: The file system issues have been solved and the file system is mounted again on both Hexagon and Grunch. Access has been reopened.

Hexagon and the /shared file system are scheduled for downtime on Wednesday, 23rd of May. The downtime will start at 09:00 and we expect to have the systems back by 16:00 the same day.

We apologize for any inconvenience this downtime may cause.

Hexagon: urgent maintenance on March 26th, 9:00-13:00

Alexander Oltu • March 14, 2018

Hexagon has accumulated a number of hardware failures that have to be fixed to ensure stable operation. Hexagon will be fully stopped and login nodes will not be accessible. We expect to finish within 4 hours.

We have also discovered a bug in our SLURM statistics. As a result, we will have to delete all jobs from the queue system during this downtime, including PENDING jobs.

We apologize for any inconvenience this downtime may cause.

Date: March 26
Timeslot: 9:00-13:00

Update:

26.03.18 15:20 The machine is still down due to hardware issues. We are working on it. We will keep you updated.
27.03.18 14:00 Hardware problems are fixed and access to the machine is reopened now.

Hexagon scheduled maintenance on January 3rd

Alexander Oltu • December 18, 2017

We will shut down Hexagon for maintenance on January 3rd at 09:00 to continue reconfiguration tasks. We expect to have Hexagon up again the same day at around 16:00.

Update 2018-01-03 19:23:

Access to Hexagon has been reopened.
The /work file system had to be reformatted. Please accept our apologies for any inconvenience this might have caused.
The /home storage area has been increased and the default quota has been doubled from 10GB to 20GB for each user.

HPC Syslog

Log of changes and events on UiB's HPC systems