13:55:21 #startmeeting oVirt Infra meeting
13:55:21 Meeting started Mon Jul 1 13:55:21 2013 UTC. The chair is knesenko. Information about MeetBot at http://wiki.debian.org/MeetBot.
13:55:21 Useful Commands: #action #agreed #help #info #idea #link #topic.
13:55:32 #chair obasan
13:55:32 Current chairs: knesenko obasan
13:55:39 #chair eedri dcaro
13:55:39 Current chairs: dcaro eedri knesenko obasan
13:55:56 * eedri here
13:56:12 ewoud, can you attend it?
13:57:08 #topic Hosting
13:57:39 * obasan here
13:57:39 #chair obasan
13:57:39 Current chairs: dcaro eedri knesenko obasan
13:57:50 ok let's start guys
13:58:09 in a few minutes
13:58:43 dneary, around?
13:58:50 dneary, if you want to attend the meeting
13:59:02 eedri, Yup, here
13:59:16 #chair dneary
13:59:16 Current chairs: dcaro dneary eedri knesenko obasan
13:59:31 eedri: want to lead the meeting ?
13:59:54 knesenko, no, go ahead
13:59:57 eedri: ok
14:00:06 ok let's start then
14:00:24 I have some updates regarding the hosting
14:00:35 dcaro: found some issue on rackspace01 ...
14:00:56 seems like on each reboot, the default gateway was deleted somehow ...
14:01:04 I'll need to retest it ...
14:01:50 knesenko, did you try disabling the firewall?
14:01:51 I will test the iptables rules and the default gateway issues + reboot. if everything works ok - I will install ovirt_engine + gluster service on them
14:02:18 knesenko, if rackspace has an external firewall, maybe the local iptables is not needed
14:02:25 eedri: I can ... but I am not sure that we are allowed to do it
14:02:41 dneary, ?
14:02:46 dneary, what do you think?
14:02:52 eedri: no need either, the iptables/firewalld are ok, the problem is just the gateway
14:02:54 eedri, Following along
14:03:00 Afraid I have no advice to offer
14:03:08 that's why it works from the internal hosts (same network)
14:03:15 eedri: it would be great if we could just disable iptables rules and work without them
14:03:25 knesenko, eedri: Is NM managing the networking there?
14:03:33 knesenko, i think we can do whatever we want, it's our servers to manage, just need to be sure the external firewall is working
14:03:50 eedri: external firewall is working
14:03:52 eedri: not actively, highlight if you need me
14:04:11 dneary, NM?
14:04:29 Network Manager
14:04:42 knesenko, ?
14:04:54 eedri: let me check
14:05:11 dneary: eedri yes it's up
14:06:21 knesenko: we have this in the net config for em1: NM_CONTROLLED=no
14:07:08 dcaro: right ... so actually we don't use it
14:08:54 anything else on hosting ?
14:08:56 knesenko, let's send a ticket to rackspace asking about disabling the iptables service
14:09:20 eedri: we can update the existing one
14:09:35 new alterway contact - Hervé Leclerc (CCed) will be taking over from Kévin Mazière
14:09:46 dcaro: can you ask rackspace if we can disable the iptables rules on the hosts ?
14:09:59 knesenko: sure
14:10:12 #action dcaro ask rackspace if we can disable the iptables rules on the hosts
14:10:26 dcaro: ok good. thanks
14:10:40 #info new alterway contact herve.leclerc@alterway.fr ( Hervé Leclerc )
14:10:46 dcaro: what about the graphite ? any progress there ?
14:11:48 dcaro, OK thanks - in that case the default gateway is set in /etc/sysconfig/network-scripts/ifconfig-em1
14:11:59 Or something similar to that (that path was from memory)
14:12:18 dneary, i think usually it's ifcfg-em1
14:12:42 knesenko: nope, lack of time, will try to address it this week
14:12:46 eedri: we have ovirt-engine on that host ... so in that case it's ifcfg-ovirtmgmt
14:12:52 dcaro: ok thanks
14:12:57 knesenko, ok
14:13:03 dneary: almost, /etc/sysconfig/network-scripts/ifcfg-em1
14:13:07 :)
14:13:15 obasan: what about openshift quota ?
14:13:48 knesenko, that won't be a problem. I found the command that does it
14:13:58 knesenko, if we want to check quota we can use an ssh command and get the result
14:14:09 obasan, can you add it to our monitoring server then?
14:14:17 eedri, yes
14:14:20 obasan: good
14:14:47 #action obasan add monitoring openshift quota to our monitoring server
14:15:07 does anybody know how I can remove items from actions ?
14:15:17 undo?
14:15:23 --#undo
14:15:51 http://wiki.debian.org/MeetBot
14:16:14 eedri: I have this action from the previous meeting - ACTION: obasan look into monitoring openshift quota
14:16:24 eedri: and I want to remove it because it's done ... and finished
14:16:48 knesenko, i think the meeting summary won't show it unless you add it again
14:16:56 eedri: ok good.
14:17:02 so anything else on hosting ?
14:17:28 knesenko, let's move on to jenkins?
14:17:33 ok
14:17:39 #topic jenkins
14:17:49 who wants to start ?
14:17:59 knesenko, any update on the error on the backup job?
14:18:18 * dneary has some updates on the website from last week - ping me when ready?
14:18:30 eedri: backup was fixed ...
14:18:50 eedri: it will run the full backup on a weekly basis ...
14:19:00 #info jenkins config backup job was fixed, changed to run a full backup weekly
14:19:07 knesenko, thanks
14:19:45 last ovirt meeting we agreed to upgrade the only f17 slave to f19
14:19:48 so it should reduce the space on alterway02
14:19:57 ewoud, any progress with that?
14:20:05 knesenko, yes
14:20:54 also we are short of resources .... so we need to push rackspace* installation asap
14:21:05 me and obasan will do it today
14:21:10 knesenko, +1
14:21:38 quaid, around?
14:21:41 +1, last week a couple of slaves killed the slave thread because they were out of resources :S
14:22:50 i know I know .... I need to push it .... I am sorry .... but I was blocked on some issues there ....
14:22:57 I hope I will finish it this week ....
14:23:33 ewoud, ?
14:23:57 eedri: seems like he is not here
14:24:18 knesenko, ok, let's just add it as an action
14:24:29 #action need to upgrade f17 jenkins slave to f19
14:24:42 we'll see later who can pick this up
14:24:51 I don't think that upgrade will work ...
need to reinstall it
14:25:04 knesenko, might work f17-f18 then f18-f19
14:25:06 eedri: no, no progress on f17 => f19
14:25:17 knesenko, i upgraded the f17 to f18 and it worked
14:25:59 eedri: we can try.
14:26:11 eedri: but having a clean f19 sounds better
14:26:15 knesenko, if rackspace can be ready this week, that's best
14:26:21 xd
14:26:23 knesenko, that way we can install an f19 vm
14:26:54 eedri: I hope it will be ready today
14:26:58 right obasan ?
14:27:11 ok, we can focus then on getting rackspace ready, knesenko obasan, if you need help just tell me
14:27:16 :)
14:27:26 dcaro: ok thanks
14:27:57 dneary, ?
14:28:03 eedri!
14:28:05 dneary, you had updates on the site right?
14:28:10 Yup
14:28:18 #topic oVirt website
14:28:39 As you all noticed, we have had some outages and performance issues over the past couple of weeks
14:28:50 Last week I spent some time digging into them to understand
14:29:13 Summary (much of this has gone to the mailing list already, but there are a few topics for discussion)
14:29:50 # Thanks to garrett we have now got an upgraded Strapping theme with many fixes - we should now be in less danger of filling up the disk with error logs
14:30:04 Also plugged 3 security holes at the same time
14:30:19 and improved the responsive layout for phones and tablets
14:30:28 (and improved rendering in Internet Explorer as well)
14:31:15 # There was an OpenShift issue due to the move from V1 to V2 cartridges in OpenShift - an environment file which was expected in V2, but missing in V1, was causing the environment to be set incorrectly, so the Pear Mail module was not being found - since last week, emails are now working again
14:31:36 We can now deploy cleanly & properly with Mail in deplist.txt
14:32:31 # The performance issue - I was connected to our server during a severe slow-down, and noticed a lot of Bot traffic - seemed like there were 2-3 spiders trawling the entire history of the wiki indexing everything all at the same time
14:32:54
dneary, did you manage to block it via httpd/robots.txt?
14:32:57 So - this is the action/discussion part - we need to figure out how to defend ourselves against misbehaving spiders
14:33:02 eedri, No
14:33:05 Nothing done yet
14:33:15 dcaro, has some experience with it on gerrit
14:33:33 dneary, also, we can consider adding monitoring graphs to ovirt.org
14:33:40 dneary, and check load statistics
14:33:45 I have a bunch of promising links, but I need to learn more about the area
14:33:50 http://www.robotstxt.org/orig.html#code
14:34:00 http://www.thesitewizard.com/apache/block-bots-with-htaccess.shtml
14:34:15 dcaro, how did you block the bots from gerrit.ovirt.org?
14:34:28 Need some MediaWiki specific recipes
14:34:30 dneary: actually we have all the robots blocked under gerrit.ovirt.org, especially bing was loading the server too much
14:34:50 dcaro, we do want wiki pages to be indexed by Google
14:35:01 dcaro, But not the entire history - just the latest revision
14:35:38 One last thing
14:35:39 dneary: I hope they have a common format
14:36:11 dcaro, I bet you can just filter out anything that calls index.php directly
14:36:32 dcaro, They all have "revision=" GET queries for history
14:36:40 OK - last point:
14:36:59 # Our wiki bot appears not to be running any more on resources.ovirt.org
14:37:11 Rydekull set it up, but I don't know how to restart it
14:37:25 There are errors for every new page revision in error.log now
14:37:43 We should either disable the bot, or ensure it's running on resources
14:38:28 looks like a monitoring task
14:38:40 Yes
14:38:47 For all of these we need monitoring
14:39:08 And we need a full list of all of the servers we have & services they're running & how to stop & restart them and so on
14:39:38 In general, I think we need to have someone (as per quaid's email last week) to take on the mantle of infra team co-ordinator
14:39:41 also what we want to monitor and when to restart them (high load, error log...)
14:40:03 * dneary looks at dcaro, eedri
14:40:16 Do either of you have some time to whip everything into shape?
14:40:18 can we add a wiki page on infra on what we are currently monitoring?
14:40:38 xd, I can try to give it some time
14:40:44 dneary, obasan installed a monitoring server lately, we can look into adding more services to monitor
14:41:01 dneary, can you add it as a task to the trac?
14:41:15 dneary, so we won't forget about it, which services we want to monitor, etc...
14:41:39 eedri, Just to be precise, what's "it" in your question above?
14:41:59 dneary, :), the list of servers/services we want to monitor
14:42:38 eedri, OK
14:42:48 dneary, from your perspective (i.e. ovirt.org or the service that didn't work on resources.ovirt.org)
14:42:59 dneary, we can update that ticket as we go along with more services that come to mind
14:43:04 eedri, New Trac, or a comment on the monitoring ticket?
14:43:20 dneary, if there is already a ticket, then a new comment will be good :)
14:43:42 obasan, can you review the monitoring ticket later and see what can be added?
14:43:52 eedri, yes
14:44:49 #action need to add monitoring to ovirt.org load/bots
14:45:25 #action look into blocking overloading bots on wiki page
14:45:45 dneary, anything else on ovirt site?
14:46:25 did dneary mention the missing new project incubation page?
14:47:24 mburns, not that i recall. mainly on outage and upgrades
14:48:10 ok guys anything else ?
14:48:32 knesenko, we should quickly scan/review trac tickets
14:48:32 let's move to puppet then ?
14:48:37 knesenko, sure
14:48:46 eedri: let's do it at the end ?
14:48:57 knesenko, +1
14:48:58 in the last topic
14:49:04 #topic puppet
14:49:11 dcaro: any news here ?
14:49:35 knesenko: not on my side :S, still have to catch up
14:49:46 knesenko, maybe ewoud can update
14:50:01 ewoud: ? :)
14:50:07 * ewoud has been lacking time
14:50:12 ewoud, do we have a running puppet master ?
14:50:23 eedri: we do, and we manage user accounts with it for a few admins
14:50:24 ewoud, so new rackspace vms can use..
14:50:25 * quaid lurking now if there are any open questions for him
14:50:36 eedri: there's even a git repo on gerrit
14:50:40 ewoud, ok, so what is missing is just new puppet classes...
14:50:56 dcaro: we will need your help to add rackspace* servers to the puppet server
14:50:57 eedri: exactly
14:51:07 knesenko, ewoud let's try to push new classes after the rackspace install
14:51:26 knesenko, we'll see what's missing and add it (ntp/nfs/services/etc...)
14:51:27 I'd like to install nrpe on each machine using puppet and generate the icinga config
14:51:44 i can help with it if needed
14:51:49 obasan: ^^ :)
14:52:08 obasan, can you send the nrpe puppet class for review?
14:52:29 eedri, I already sent one. and then ewoud introduced an improved one
14:52:37 obasan, ok, so it's merged?
14:52:45 eedri: no, we should add it to our infra
14:52:55 eedri: and it doesn't use exported resources yet
14:53:10 I'm a bit unfamiliar with how nagios/icinga configs are written
14:53:16 ewoud, oh... so after we merge it, there is still a manual process to load it to foreman?
14:53:23 eedri: yes
14:53:36 ewoud, can't we just add a cron job to do it?
14:53:47 dcaro, ?
14:53:57 eedri: we could replace that with jenkins
14:54:09 ewoud, +1
14:54:10 eedri: ewoud: yep
14:54:18 ewoud, a jenkins job is already better than a local/hidden cron
14:54:18 in essence it's just a git push on gerrit merge
14:54:51 dcaro, can you add it to jenkins?
14:54:52 we just need to add the jenkins public key to puppet@foreman and define the job
14:55:09 eedri: ok, I'll create a ticket for me
14:55:15 dcaro, thanks
14:55:18 dcaro: good
14:55:28 did you guys talk about quaid's mail?
14:55:29 #action dcaro to add jenkins job to load new puppet classes to foreman
14:55:42 ewoud, no, you mean the ovirt.org cert?
14:56:09 I think ewoud is talking about the project coordinator
14:56:10 eedri: no, the project leader
14:56:12 that
14:56:21 I have yet to respond to it, but I fully agree with it
14:56:34 ewoud, i didn't respond yet either
14:56:45 I'd also like to note that I'm not the ideal person for it
14:57:00 eedri: we need to raise it and discuss it with our manager
14:57:43 knesenko, ewoud let's continue the discussion on the ml and see who can take the lead on it
14:57:50 eedri: let's
14:57:50 eedri: ok
14:57:59 anything else on puppet ?
14:58:27 #topic other business + review tickets
14:58:41 ok let's review upstream tickets
14:58:50 https://fedorahosted.org/ovirt/report/1
15:00:05 let's start with the jenkins component tickets
15:00:17 https://fedorahosted.org/ovirt/ticket/59
15:00:37 oh that's mine :) I will handle it
15:00:46 https://fedorahosted.org/ovirt/ticket/43
15:01:06 this one ^ is related to F19 upgrade
15:01:19 do we have a volunteer here ?
15:01:25 knesenko, yes, so if we finish rackspace this week, 1st vm should be f19
15:01:26 and to the rackspace servers
15:01:36 knesenko, otherwise we'll upgrade the existing f17
15:01:50 eedri: ok ... so I will take this ticket as well
15:02:12 next
15:02:13 https://fedorahosted.org/ovirt/ticket/51
15:02:25 also pending new vms on rackspace
15:02:29 eedri: seems like you are working on it right ?
15:02:34 and a possible baremetal server
15:02:40 to run vdsm tests
15:02:40 eedri: ok. can I assign this ticket to you ?
15:02:47 sure
15:03:23 https://fedorahosted.org/ovirt/ticket/6
15:03:56 i think foreman is up
15:03:58 I think that we did most here ... puppet and foreman are up and running
15:04:00 and puppet as well
15:04:06 need to add new classes to the puppet ...
15:04:11 yep, basic installation is running
15:04:11 so we can close it then ?
15:04:20 knesenko, i'd close it and open a new one
15:04:26 for puppet classes needed
15:04:27 eedri: ok
15:04:59 https://fedorahosted.org/ovirt/ticket/17
15:05:29 ok we have a backup for jenkins .... what about other services ?
15:05:33 dneary, ?
15:05:43 dneary, is openshift/mediawiki backed up?
15:05:47 eedri, I do not know
15:05:56 :\
15:06:04 eedri, quaid may know
15:06:12 quaid, ?
15:06:14 ok ... I think that we should open a thread on it on ML ...
15:06:30 knesenko, agree, can you send an email on it?
15:06:31 we need a defined server/storage to store our backups
15:06:36 eedri: yes
15:07:02 #action knesenko send email regarding the backups, related to https://fedorahosted.org/ovirt/ticket/17
15:07:12 https://fedorahosted.org/ovirt/ticket/28
15:07:31 knesenko, I agree
15:07:49 quaid: any update on https://fedorahosted.org/ovirt/ticket/28 ?
15:08:18 dneary: ^^ maybe you know ?
15:08:24 * quaid back
15:08:58 hmm, I'm drawing a blank on mediawiki backup
15:09:29 knesenko: about SSL, I just asked dcaro & eedri if they can finish that one for me, the IT group is ready to get the CSR and have it signed, then we drop it in place everywhere, etc.
15:09:32 quaid, knesenko: I've gone through the SSL process for openstack.redhat.com
15:09:37 It's not complicated
15:09:42 we worked out the details, but I ran out of time to finish
15:09:53 not sure if either can take it over, haven't seen their reply
15:10:11 ok thanks for the update ...
15:10:17 quaid: yes, we will :)
15:10:41 dneary: quaid guys if you are short of time, I think that we can handle it
15:10:57 +1 thanks
15:11:01 dcaro: can you take this task ?
15:11:03 this new project is eating me up
15:11:25 quaid: if only you could just tell us what it is and we could help ;)
15:11:51 quaid: just update the ticket with the info, or email will be great ...
15:11:54 ewoud: you know how companies are, they love the Big Reveal :)
15:11:59 knesenko: will do
15:12:04 knesenko: ok
15:12:09 dcaro: thanks a lot
15:12:30 * knesenko moving the ticket to dcaro
15:12:49 quaid: please update the ticket + talk to dcaro if he needs any info to finish this task
15:13:01 https://fedorahosted.org/ovirt/ticket/53
15:13:14 eedri: ^^ ?
15:13:35 eedri: want to handle it ?
15:14:03 knesenko, sure. should be easy now that we have backups
15:14:32 *clash*
15:14:35 good
15:14:36 thanks
15:15:04 * quaid puts in ssl notes
15:15:21 https://fedorahosted.org/ovirt/ticket/10
15:15:58 it's already assigned to dneary quaid .... moving on
15:16:07 It's assigned to me?
15:16:14 https://fedorahosted.org/ovirt/ticket/41
15:16:30 dneary: to quaid
15:16:38 dneary: you reported
15:16:42 https://fedorahosted.org/ovirt/ticket/41
15:16:42 Yup
15:16:57 41 i think is pending dcaro's patches to be merged
15:17:06 aha ...
15:17:13 so I will assign it to dcaro then
15:17:14 if we can confirm that any open tickets that I own are actually for me/best done by me, I can just schedule to do them -- that's the sort of proj management I'm needing :)
15:17:19 * quaid looks to see what he owns
15:17:23 OK - quaid, we should figure out who will do that list rename. I'm ready to give it a go, but would want to have a fallback plan in place if things go wrong
15:17:42 knesenko: yes please
15:18:28 https://fedorahosted.org/ovirt/ticket/42
15:18:36 * quaid sees he has only one ticket anyway
15:18:48 quaid: seems like this one ^^ is related to the SSL ticket right ?
15:19:01 ah-ha, yes
15:19:05 ok
15:19:36 quaid: please add more info on it, so we can handle it from there. thanks
15:19:37 next
15:19:54 dneary: perhaps we can coordinate so I'm available in case you have problems, etc.? i.e. pick a time early in my day when we have crossover of a few hours
15:20:23 quaid: can you handle the mailing list tasks ? can i assign them to you ?
15:20:34 yeah
15:20:42 perhaps this week?
15:20:59 quaid: there are 2 ...
15:21:11 dneary: sure
15:21:26 knesenko: which ones are they?
15:21:34 https://fedorahosted.org/ovirt/ticket/25
15:21:43 https://fedorahosted.org/ovirt/ticket/12
15:21:49 quaid, Tomorrow is looking promising for me
15:22:19 For #25, I think I need quaid's archives - I don't have all of the missing messages
15:24:13 I don't either, and that's probably the biggest catch there
15:24:25 someone sent them to me but I lost that archive, I'm sure someone else has them though
15:24:30 guys seems like we are out of time. we will continue to review the tickets at our next meeting.
15:24:33 * quaid has them but all in one mailbox
15:25:01 thank you all. Please make sure to handle your tickets (If you have time)
15:25:13 thanks
15:25:21 anything else before I finish this meeting ?
15:25:38 5
15:25:39 3
15:25:41 2
15:25:42 1
15:25:43 ?
15:25:51 ok thank you all ...
15:26:00 have a nice week
15:26:01 #endmeeting