13:58:51 <knesenko> #startmeeting oVirt Infra
13:58:51 <ovirtbot> Meeting started Mon Sep 30 13:58:51 2013 UTC. The chair is knesenko. Information about MeetBot at http://wiki.debian.org/MeetBot.
13:58:51 <ovirtbot> Useful Commands: #action #agreed #help #info #idea #link #topic.
13:59:02 <knesenko> #chair ewoud knesenko dcaro
13:59:02 <ovirtbot> Current chairs: dcaro ewoud knesenko
13:59:19 <knesenko> eedri is not here today ...
13:59:28 <knesenko> obasan is not here
13:59:35 <ewoud> ok
14:00:02 <knesenko> anyone else here? new folks?
14:00:09 * orc_orc is here
14:00:14 <orc_orc> ... Russ Herrold
14:01:31 <knesenko> dcaro: can you lead the meeting?
14:01:35 <knesenko> dcaro: I need to go
14:01:40 <knesenko> dcaro: explain later
14:01:59 <dcaro> knesenko: ok
14:04:04 <dcaro> Hi Russ, do you want to join ovirt infra then?
14:04:27 <orc_orc> more, I wish to watch a bit and learn more about the project first
14:04:52 <orc_orc> I have tried local setups and such, and the CI model is something I wish to replicate on my development bench
14:05:34 <orc_orc> I have a wiki account (so I may correct bitrot) and have been trying to work through it
14:05:52 <dcaro> orc_orc: ok, you are welcome to ask anytime :)
14:05:52 <dneary> hi all
14:05:52 <ewoud> orc_orc: I assume you also follow the mailing list then?
14:06:00 <orc_orc> dcaro: absolutely
14:06:22 <dcaro> #chair dneary
14:06:22 <ovirtbot> Current chairs: dcaro dneary ewoud knesenko
14:07:18 <ewoud> I don't think we have an agenda
14:07:46 <ewoud> let's first talk about the artifactory server
14:08:04 <dcaro> #topic Artifactory server
14:08:26 <dcaro> eyal told me that it's almost ready, what's missing?
14:08:51 <ewoud> #info network issues resolved, were duplicate mac addresses; pool on alterway02.ovirt.org has been adjusted not to conflict
14:09:05 <ewoud> I don't know how it was installed, but it wasn't using puppet
14:09:26 <dcaro> I haven't seen any patches there either
14:11:04 <ewoud> I think what was missing was mostly that jenkins wasn't using it yet
14:11:32 <dcaro> :), yep, we have to make some changes to some of the jobs to pick up the maven settings
14:11:58 <ewoud> I think we need a puppet patch to fill ~jenkins/.m2/settings.xml on every slave
14:13:04 <ewoud> looking at the bash history, the install was rather trivial so I'll try to write a very simple patch
14:13:13 <ewoud> #action ewoud puppetize artifactory
14:13:30 <dcaro> ewoud: it's a little more complicated, as some of the jobs run maven inside makefiles and scripts, which requires adding some parameters to use other maven configs
14:14:47 <ewoud> dcaro: doesn't maven read the settings stacked? so first system, then user and then project?
14:16:01 <dcaro> ewoud: I was told that you have to pass some extra flags to make-based projects, maybe they overwrite some of those options inside the scripts or something
14:16:24 <ewoud> hmm ok
14:16:50 <dcaro> ewoud: but we can try :)
14:18:57 <dcaro> ok, any volunteers?
14:19:27 <ewoud> I think someone on the ML already supplied the config
14:21:07 <ewoud> http://lists.ovirt.org/pipermail/infra/2013-September/004023.html
14:21:07 <dcaro> yep, alon passed the new settings.xml file
14:21:18 <ewoud> should we just deploy that using puppet?
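A minimal sketch of the puppet patch ewoud proposes here, assuming a 'jenkins' puppet module that ships Alon's settings.xml as a module file (the module name and file layout are assumptions, not the actual infra repo layout):

    # Deploy the Artifactory-aware Maven settings to every Jenkins slave.
    # Jobs that call maven directly will pick this up automatically; jobs
    # that run maven from makefiles or scripts may still need an explicit
    # "mvn -s ~jenkins/.m2/settings.xml" if the script passes its own flags.
    file { '/home/jenkins/.m2':
      ensure => directory,
      owner  => 'jenkins',
      group  => 'jenkins',
    }

    file { '/home/jenkins/.m2/settings.xml':
      ensure  => file,
      owner   => 'jenkins',
      group   => 'jenkins',
      mode    => '0644',
      source  => 'puppet:///modules/jenkins/settings.xml',
      require => File['/home/jenkins/.m2'],
    }

On the stacking question: maven merges the global settings ($M2_HOME/conf/settings.xml) with the user settings (~/.m2/settings.xml), with the user file winning, so a per-user file on each slave should suffice unless a wrapper script overrides it with -s/-gs.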
14:21:35 <dcaro> agree
14:22:11 <ewoud> #action ewoud deploy settings.xml to all jenkins slaves
14:22:49 <dcaro> ewoud: we will have to make sure the jobs are actually using it
14:23:13 <ewoud> dcaro: could you have a look at that once we've deployed it?
14:23:41 <dcaro> #action dcaro make sure jobs use the new settings.xml file
14:24:39 <dcaro> ok, next issue
14:25:16 <dcaro> #topic new machine ovirt03.redhat.com/rackspace03.ovirt.org
14:25:56 <dcaro> I've opened the ticket to get it installed, I hope it will be available this week
14:26:15 <ewoud> ok, then we need to think of a migration plan
14:26:26 <ewoud> one option is just to install new slaves
14:26:49 <ewoud> I was thinking that it might be useful to have the OS in the hostname
14:27:18 <ewoud> such as slave-f18-01, slave-f19-01 etc
14:28:35 <orc_orc> ewoud: is there a convention of using a unique host name in the A record, and then a CNAME for such mutable information?
14:28:59 <dcaro> orc_orc: ewoud yep, I was thinking of something like that
14:29:16 <dcaro> as we use dns entries to connect to the slaves
14:30:09 <ewoud> orc_orc: since the OS is part of the role of those slaves, I think it's part of the hostname
14:30:10 <dcaro> but we do not control the dns directly (each change requires a request to a third party)
14:30:28 <ewoud> dcaro: yes, eventually we may need to find a solution for that
14:30:34 <dcaro> so it's not agile to change right now
14:31:43 <orc_orc> so, a locally manageable solution would be to ask for a permanent and unique A record when a host is deployed, and then assign and manage a CNAME in a domain under local control?
14:31:57 <orc_orc> that solves the reliance on the third party after the initial setup
14:32:14 <orc_orc> (I do something like that for the LSB)
14:32:50 <ewoud> orc_orc: ovirt.org is managed by Red Hat IT, so requesting an A record and a CNAME is the same procedure
14:33:29 <dcaro> I think he proposes to host our own dns servers for a subdomain so we can change the config ourselves
14:33:31 <ewoud> we could run bind on foreman and replicate it to other servers for reliability, change the NS records in the .org zone and manage it ourselves
14:34:13 <orc_orc> dcaro: yes -- either a sub-delegation (harder) or a local utility domain (easier)
14:35:11 <ewoud> at $dayjob I've done the sub-delegation and it's not that hard if you have access to the correct accounts
14:35:17 <ewoud> not sure how easy that is to do with RH IT
14:35:38 <dcaro> we can try
14:35:55 * eedri here
14:36:08 <dcaro> #chair eedri
14:36:08 <ovirtbot> Current chairs: dcaro dneary eedri ewoud knesenko
14:36:39 <ewoud> but what's the plan with the new host?
14:36:51 <ewoud> install fresh slaves or try to migrate the existing ones?
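A sketch of the utility-domain idea orc_orc and ewoud discuss above: bind on foreman serving a zone the infra team controls, with role names as CNAMEs pointing at the stable A records Red Hat IT manages. The zone name, module layout, and record targets are all hypothetical:

    # bind on foreman.ovirt.org serves a zone the infra team controls;
    # RH IT only needs to delegate it once (NS records in ovirt.org).
    # Example records for the zone file (names purely illustrative):
    #   slave-f18-01  IN CNAME  rackspace03.ovirt.org.
    #   slave-f19-01  IN CNAME  rackspace03.ovirt.org.
    # Renaming or repointing a slave then only touches this file.
    package { 'bind':
      ensure => installed,
    }

    file { '/var/named/infra.ovirt.org.zone':
      ensure => file,
      owner  => 'named',
      group  => 'named',
      source => 'puppet:///modules/dns/infra.ovirt.org.zone',
      notify => Service['named'],
    }

    service { 'named':
      ensure  => running,
      enable  => true,
      require => Package['bind'],
    }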
14:36:59 <dcaro> hehehe, we diverged a bit, yes :)
14:37:08 <knesenko> I am sorry, I am back
14:37:29 <dcaro> knesenko: eedri welcome back
14:37:51 <dcaro> we are discussing the plan for the new rackspace03 server
14:38:14 <knesenko> ah ok
14:38:27 <knesenko> seems like we need to reinstall rackspace
14:38:30 <knesenko> ovirt0-engine
14:38:38 <knesenko> I don't like local storage setups
14:38:53 <ewoud> knesenko: that was the plan with the new server
14:39:07 <ewoud> install that in a cluster setup with just 1 host initially
14:39:27 <ewoud> attach the gluster storage to it
14:39:42 <ewoud> then we can either migrate existing hosts or install fresh ones
14:39:47 <ewoud> I'm leaning toward the latter
14:40:03 <knesenko> yes ok
14:40:49 <dcaro> so how will we migrate the old hosts to gluster? Just create the vms from scratch again?
14:41:13 <ewoud> I'd say yes
14:41:29 <knesenko> dcaro: I think that we can create an export domain ... export them and import them into another DC
14:41:46 <ewoud> so we start by emptying one (say rackspace01)
14:41:55 <ewoud> and exporting + importing could work
14:42:00 <knesenko> right
14:42:17 <ewoud> then once it's empty, we reconfigure rackspace01 to join the cluster with rackspace03
14:42:25 <ewoud> and when that works, we do the same for rackspace02
14:42:30 <dcaro> deleting the vms means creating/reconfiguring the slaves on jenkins
14:43:20 <ewoud> export + import could be quick, depending on the storage size
14:46:03 <dcaro> I prefer export + import
14:46:13 <knesenko> dcaro: me too
14:46:31 * ewoud has no preference
14:46:54 <dcaro> can we document the procedure somewhere?
14:47:27 <ewoud> wiki? etherpad?
14:47:29 <knesenko> dcaro: I have a ticket somewhere to plan that migration
14:47:40 <dcaro> perfect
14:48:00 <dcaro> next then?
14:48:41 <knesenko> yes
14:49:25 <dcaro> ewoud: did you check if we can kickstart using dhcp on rackspace?
14:49:39 <ewoud> dcaro: haven't had time
14:49:44 <dcaro> ok
14:50:04 <dcaro> #topic jenkins update to latest lts
14:50:18 <dcaro> ewoud: what's the status?
14:50:28 <ewoud> dcaro: oh, I forgot to mail an update
14:50:44 <ewoud> but I updated it using yum update, service jenkins restart
14:51:44 <dcaro> ewoud: I remember I had some problems with jenkins sending the gerrit message twice, and I had to restart it, but I think that was on wednesday though
14:51:54 <dcaro> anyhow, any issues?
14:52:00 <ewoud> not that I know of
14:52:06 <dcaro> good :)
14:52:10 * ewoud checked if all slaves reconnected
14:52:24 <dcaro> we have been having dead slaves lately
14:53:12 <ewoud> one slave or different ones?
14:53:31 <dcaro> iirc more than one
14:54:48 <dcaro> I think they run out of memory and the kernel kills the java process
14:55:24 <ewoud> then dmesg should show it
14:55:57 <dcaro> ewoud: it does, but I'm not sure of the date :S
14:56:41 <ewoud> dcaro: you can clear it using dmesg -c and then it should be at least somewhat recent
14:57:10 <dcaro> ewoud: I'll do that
14:57:43 <orc_orc> ewoud: a more durable solution might be: ( date ; dmesg -c ) >> /var/log/messages
14:57:57 <orc_orc> that gets a searchable record with timestamps
14:58:19 <ewoud> dcaro: what orc_orc said ;)
14:59:32 <orc_orc> ewoud: I recall I wrote that for a deployment outline when I was having troubles with a given project. Also I made sure sysstat (sar) was present and running
15:02:31 <ewoud> any other items on the agenda?
15:02:46 <orc_orc> I had an 'add on item' I noticed
15:03:14 <ewoud> 'add on item'?
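orc_orc's dmesg capture, expressed as a puppet cron resource so every slave gets it; the 10-minute interval is an arbitrary choice:

    # Append kernel messages (e.g. OOM kills of the jenkins java process)
    # to /var/log/messages with a timestamp, then clear the ring buffer so
    # the next run only logs new events.
    cron { 'dmesg-to-messages':
      ensure  => present,
      user    => 'root',
      minute  => '*/10',
      command => '( date ; dmesg -c ) >> /var/log/messages',
    }

With timestamps alongside the kernel output, matching an OOM kill to a dead-slave report becomes a simple grep through /var/log/messages.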
15:03:26 <orc_orc> an item not 'on the agenda'
15:03:29 <orc_orc> reading the mailing list, I had reviewed the Logwatch report for linode01.ovirt.org
15:03:34 <orc_orc> see, e.g.: http://lists.ovirt.org/pipermail/infra/2013-September/004030.html at: sendmail-largeboxes
15:03:39 <orc_orc> and it appears that the mail spool for userid: jenkins is not being purged of older messages
15:04:05 <orc_orc> I assume the emails to that account ID may be needed to trigger a VCS check and a build when a commit message is seen.
15:04:12 <orc_orc> ... but then perhaps those emails should be deleted after a week or so, to prevent the file from growing without limit
15:04:31 * ewoud looks at what's even in there
15:04:34 <orc_orc> it may be that the emails are not wanted at all and should be devnulled at once
15:04:40 <orc_orc> * nod *
15:05:27 <dcaro> I fear that most of them might be cron jobs or something
15:05:36 <orc_orc> (buildbot, with which I am familiar, may have builds triggered by 'seeing' commit emails)
15:06:54 <ewoud> it seems it tries to mail gerrit2@gerrit.ovirt.org but can't connect
15:07:22 <orc_orc> ... so probably those: email delivery delayed warnings
15:07:39 <ZummiG777> Question: I'm interested in pre-populating SSL keys on the ovirt engine location and on the vdsm side of things (so that as much as possible can be automated). Does anybody know what files need to be copied to the VDSM side to allow this, specifically for the authorized_keys file entries
15:07:48 <ewoud> orc_orc: thanks for noticing
15:07:55 <orc_orc> ewoud: no problem ;)
15:08:51 <ewoud> dcaro: any way we can tell jenkins not to mail gerrit?
15:09:16 <dcaro> why is it trying to email gerrit in the first place?
15:09:46 <ewoud> dcaro: I think because a build failed and then it will try to mail the one who broke it?
15:10:02 <ewoud> gerrit will place itself as committer if you cherry-pick
15:10:09 <dcaro> I think the problem is that jenkins creates the patch review as user 'jenkins@ovirt.org', and then gets all the emails for further updates of the patches...
15:11:01 <ewoud> Project: http://jenkins.ovirt.org/job/ovirt_engine_update_db_multiple_os/./label=centos64/
15:11:05 <ewoud> that job at least mails
15:11:23 <ewoud> Project: http://jenkins.ovirt.org/job/ovirt-engine_master_create_rpms_quick/./label=fedora18/
15:11:26 <ewoud> on failure as well
15:12:21 <dcaro> ok, we can troubleshoot it on the ml
15:12:23 <orc_orc> can jenkins be instructed to substitute the last committer instead of itself?
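A sketch of the aliasing option orc_orc raises (both here and just below), using puppet's mailalias type against sendmail's aliases file on linode01; whether to redirect to the infra list or discard outright is the open question, and the recipient shown is only one of the two choices discussed:

    # Route mail for the jenkins account somewhere useful instead of
    # letting /var/spool/mail/jenkins grow without limit.
    mailalias { 'jenkins':
      ensure    => present,
      recipient => 'infra@ovirt.org',   # or '/dev/null' to devnull at once
    }

    # sendmail reads the compiled aliases db, so rebuild it on change
    exec { 'newaliases':
      command     => '/usr/bin/newaliases',
      refreshonly => true,
      subscribe   => Mailalias['jenkins'],
    }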
15:14:01 <ewoud> orc_orc: I think it's configured to mail those who changed something since the last build, which is normally a good thing
15:14:04 <eedri> ewoud, dcaro: every time someone uses the rebase button on gerrit, the committer changes to gerrit2@ovirt.org
15:14:18 <ewoud> eedri: I think gerrit 2.7 changes that
15:14:21 <eedri> ewoud, dcaro: that's a "feature" and the gerrit developers are not willing to change it
15:14:22 <ewoud> maybe 2.8
15:14:30 <orc_orc> ewoud: noted -- just hoping that in some cases an override might be used
15:14:46 <eedri> ewoud, dcaro: this is why it's important to check the checkbox in the scm advanced options to track the commit author and not the committer
15:14:51 <orc_orc> alternatively, editing the MTA's aliasing table to send the email to a monitored address may make sense
15:15:00 <ewoud> eedri: or I'm confusing it with another patch
15:15:09 <eedri> ewoud, I really hope so
15:15:21 <ewoud> eedri: I was not aware of the commit author vs committer distinction
15:15:21 <eedri> ewoud, right now we have 2.6.1
15:15:36 <eedri> ewoud, if you check the advanced git options on a jenkins job, you'll see the option
15:16:28 <eedri> ewoud, did you see artifactory is online?
15:16:54 <ewoud> eedri: yes, but it would be nice to have it puppetized
15:17:32 <dcaro> I'll add an action to check the email issue
15:18:05 <dcaro> #action dcaro investigate the origin of the jenkins@linode01 emails
15:18:43 <ewoud> eedri: http://gerrit.ovirt.org/19698 should puppetize the install
15:19:04 <ewoud> it doesn't deal with fetching the RPM and I didn't see a yum repo for their RPMs either
15:19:39 <ewoud> eedri: is openjdk-1.7.0-devel needed to run artifactory as well?
15:20:01 <ewoud> http://dl.bintray.com/content/jfrog/artifactory-rpms looks like a yum repo :)
15:21:21 <eedri> ewoud, yes
15:21:23 <eedri> ewoud, it's a must
15:21:33 <eedri> ewoud, as they write in the installation notes
15:22:10 <ewoud> eedri: ok
15:22:15 <eedri> ewoud, +1, that's about what I did
15:22:25 <eedri> ewoud, we should consider adding a running apache in front
15:22:33 <eedri> ewoud, to use port 80, but it's not critical
15:24:06 <ewoud> eedri: I did look in the history to determine that :)
15:24:56 <dcaro> any other issues?
15:26:04 <ewoud> I think we can close the meeting
15:29:00 <dcaro> #endmeeting
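For reference, a sketch of what the puppetized artifactory install discussed above might look like, combining the bintray yum repo ewoud found with the openjdk dependency eedri confirmed; resource names are assumptions based on the conversation, not taken from gerrit patch 19698 itself:

    # Artifactory from the jfrog bintray repo, plus the JDK it requires.
    yumrepo { 'artifactory':
      descr    => 'jfrog Artifactory RPMs',
      baseurl  => 'http://dl.bintray.com/content/jfrog/artifactory-rpms',
      gpgcheck => '0',
      enabled  => '1',
    }

    package { 'java-1.7.0-openjdk-devel':
      ensure => installed,
    }

    package { 'artifactory':
      ensure  => installed,
      require => [ Yumrepo['artifactory'], Package['java-1.7.0-openjdk-devel'] ],
    }

    service { 'artifactory':
      ensure  => running,
      enable  => true,
      require => Package['artifactory'],
    }

    # eedri's port-80 idea would add an apache vhost in front, proxying to
    # artifactory's default port, e.g. ProxyPass / http://localhost:8081/artifactory/
    # -- omitted here since the meeting marked it "not critical".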