Scheduled maintenance

Dear Grunch users,

We plan maintenance and downtime of grunch.hpc.uib.no for about an hour this Monday. Although grunch and cyclone share some filesystems, only grunch will be restarted; cyclone will not be affected.

Edit 2022-04-11 15:20: We have cancelled the maintenance; it will not take place this week or next. When it is rescheduled, we will post a new notice. Sorry for the false alarm.


Best regards,

The IT department and the Scientific Computing Group

Dear users of cyclone and Ubuntu,

A large part of the data under /Data/ is being migrated to the IT department's central storage. To finish the migration, this Thursday between 9:00 and 15:00 paths below /Data/ will at times be inaccessible (or at least not writable).

After the migration, on both the Ubuntu clients and cyclone, all your files will still be available under the usual /Data/gfi and /Data/skd paths. On Windows, the new locations will be as follows:

\\klient.uib.no\FELLES\MATNAT\GFI
\\klient.uib.no\FELLES\MATNAT\SKD

For your convenience, you can map each to a drive letter by following this HOWTO.
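
As a rough illustration of what such a mapping amounts to, the sketch below builds the Windows `net use` commands for the two shares. The drive letters G: and S: and the helper name `net_use_command` are assumptions made for this example only; the HOWTO remains the authoritative procedure.

```python
# Sketch only: builds (does not run) Windows "net use" commands for the shares.
# The drive letters below are hypothetical; pick any letters free on your PC.
SHARES = {
    "G": r"\\klient.uib.no\FELLES\MATNAT\GFI",
    "S": r"\\klient.uib.no\FELLES\MATNAT\SKD",
}


def net_use_command(drive_letter, unc_path):
    """Return the argument list for mapping unc_path to drive_letter."""
    letter = drive_letter.rstrip(":").upper()
    if len(letter) != 1 or not ("A" <= letter <= "Z"):
        raise ValueError(f"not a drive letter: {drive_letter!r}")
    if not unc_path.startswith("\\\\"):
        raise ValueError(f"not a UNC path: {unc_path!r}")
    return ["net", "use", f"{letter}:", unc_path]


if __name__ == "__main__":
    # Print the commands so they can be pasted into a Windows command prompt.
    for letter, path in SHARES.items():
        print(" ".join(net_use_command(letter, path)))
```

Printing the commands rather than executing them keeps the sketch safe to run on any machine, including the Linux clients.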

On the Wiki there is more detailed information on GFI & SKD Storage.

Feel free to contact us via hjelp.uib.no if you have any questions.

Update 28.05: The data migration is finished, and all Linux and Windows paths to the new destinations should be accessible.

Best regards,

The IT department and the Scientific Computing Group

 
Dear Cyclone and Grunch Users,

Our machine room at Thormøhlensgate 55 will have a power outage from Friday 17 April 15:00 to Sunday 19 April 24:00.

Maintenance work is planned on the power line in the building.
Because of this, we have to take down all our servers at 15:00 on Friday 17 April, and we hope to bring them all back online before 08:00 on Monday 20 April.

The cyclone and grunch servers will be taken down and will not be accessible during this time.

The /shared/ filesystem, which is exported via leo.hpc.uib.no to the university campus, will not be accessible during this time.

We advise all users to plan their work well in advance to avoid unnecessary problems.

Update 19.04.2020 20:30: The downtime is over; cyclone and grunch are back online.
Please contact us via hjelp.uib.no if you have any further questions.

Best Regards

Scientific computing team.


Update 12.11 21:30:

The migration is over; we managed to bring up the Lustre filesystem with the new MDS server. The /shared and /work filesystems are mounted on cyclone.hpc.uib.no and grunch.hpc.uib.no. Hexagon is up and running again. Samba and NFS exports are also running on leo.hpc.uib.no.

Update 12.11 15:00:

The migration is still ongoing; we will keep you posted.

Update 02.11 09:30:

Due to the delayed delivery of physical parts, we have to postpone the downtime to 12th November. The corresponding node reservation on Hexagon is also postponed to 12th November.

Thank you for your consideration!

Dear HPC User,

The metadata server for the /shared file system has to be replaced/upgraded, so the file system must be unmounted from all clients.

This will result in scheduled downtime for the Hexagon, Grunch and Cyclone machines. We will start at 08:00 on the 5th of November and expect to be finished by the end of the working day.

Thank you for your consideration!

Hexagon will have planned maintenance on 15th August from 08:00.

Currently the /work filesystem is running at reduced performance due to a broken storage controller.

During the maintenance, we will replace the broken storage controller for the storage system where the /work filesystem resides. Due to the high risk of data loss, we urge all /work filesystem users to back up their important, non-reproducible data.
Please keep in mind that /work is a scratch filesystem and is not backed up.
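
For users who would like a starting point, here is a minimal sketch of copying a directory tree to another location with Python's standard library. The function name `backup_tree` and the example paths in the comments are assumptions for illustration only; use whatever tool and destination suit your data, and verify the copy yourself.

```python
import shutil
import tempfile
from pathlib import Path


def backup_tree(source, destination):
    """Copy source recursively into destination.

    Returns the sorted relative paths of all files found under destination
    after the copy (this includes files that were already there).
    """
    # dirs_exist_ok=True (Python 3.8+) allows copying into an existing directory.
    shutil.copytree(source, destination, dirs_exist_ok=True)
    return sorted(p.relative_to(destination) for p in destination.rglob("*") if p.is_file())


def demo():
    # Self-contained demonstration on temporary directories. In real use the
    # source would be a directory of yours under /work and the destination a
    # backed-up location (both are hypothetical here).
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "results"
        src.mkdir()
        (src / "run1.dat").write_text("important data\n")
        copied = backup_tree(src, Path(tmp) / "backup")
        print("files in backup:", [str(p) for p in copied])


if __name__ == "__main__":
    demo()
```

The sketch copies file contents and metadata as `shutil.copytree` does by default; it does not compare checksums, so it is worth spot-checking the result for data you cannot regenerate.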


After the maintenance, we expect the /work filesystem to be back at full performance.

We appreciate your understanding.

Update 15.08.2018 11:00 

The Hexagon maintenance is over; we have successfully replaced the broken controller. The /work filesystem is back to its expected performance.

Update 23.05.2018 15:19: The file system issues have been resolved, and the file system is mounted again on both Hexagon and Grunch. Access is reopened.


A downtime is scheduled for Hexagon and the /shared file system on Wednesday, 23rd of May. The downtime will start at 09:00, and we expect to have the systems back by 16:00 the same day.


Our apologies for any inconvenience this downtime may cause you.

Hexagon has accumulated a number of hardware failures, which have to be fixed to ensure stable operation. Hexagon will be fully stopped and the login nodes will not be accessible. We expect to finish in 4 hours.

We have also discovered a bug in our SLURM statistics. As a result, we will have to delete all jobs from the queue system during this downtime, including PENDING jobs.

Our apologies for any inconvenience this downtime may cause you.

Date: March 26
Timeslot: 9:00-13:00

Update:

  • 26.03.18 15:20 The machine is still down due to hardware issues. We are working on it. We will keep you updated.
  • 27.03.18 14:00 Hardware problems are fixed and access to the machine is reopened now.

We will shut down Hexagon for maintenance on January 3rd at 09:00 to continue reconfiguration tasks. We expect to have Hexagon up again the same day at around 16:00.

Update 2018-01-03 19:23:
  • Access to Hexagon is re-opened.
  • The /work file system had to be reformatted. Please accept our apologies for any inconvenience this may have caused.
  • The /home storage area has been increased, and the default quota has been doubled from 10 GB to 20 GB for each user.